TECHNICAL FIELD

The invention relates to novel nucleotide sequences 
located in a gene which encodes a bacterial flagellin
antigen, and the use of those nucleotide sequences for the
detection of bacteria which express particular flagellin 
antigens, on the basis of that antigen alone, or in 
conjunction with the O antigen expressed by that strain. 

BACKGROUND ART 

The flagellum of many bacteria appears to be made up of a
single protein known as flagellin. The serotyping schemes
of E. coli and Salmonella enterica are based on highly
variable antigenic surface structures which include the
lipopolysaccharide which carries the O antigen and
flagellin which is now known to be the carrier of the
classical H antigen. In many strains of S. enterica there
are two loci (fliC and fljB) which encode flagellin, and a
regulatory system which allows only one to be expressed at
any time, and which also provides for expression to
alternate rapidly between the two forms, first identified
as two phases (H1 and H2) for the H antigen of most
strains. In E. coli there are 54 forms of H antigen
recognised and until recently they were all thought to be
encoded at the fliC locus, as has been shown for E. coli
K-12. However in the 1980s Ratiner [Ratiner Y A "Phase
variation of the H antigen in Escherichia coli strain
Bi327-41, the standard strain for Escherichia coli
flagellin antigen H3" FEMS Microbiol. Lett. 15 (1982)
33-36; Ratiner Y A "Presence of two structural genes
determining antigenically different phase-specific
flagellins in some Escherichia coli strains" FEMS
Microbiol. Lett. 19 (1983) 37-41; Ratiner Y A "Two genetic
arrangements determining flagellin antigen specificities
in two diphasic Escherichia coli strains" FEMS Microbiol.
Lett. 29 (1985) 317-323; Ratiner Y A "Different alleles of
the flagellin gene hagB in Escherichia coli standard H
test strains" FEMS Microbiol. Lett. 48 (1987) 97-104]
showed that in some cases there are two loci and that
expression can alternate. The matter was further
complicated by a recent paper by Ratiner [Ratiner Y A
(1998) "New flagellin-specifying genes in some Escherichia
coli strains" J. Bacteriol. 180 979-984] showing three
loci (flk, fll and flm) for flagellin in addition to fliC
although the fljB locus has not been found in E. coli.
However E. coli strains are normally identified by the
combination of one O antigen and one H antigen [and K
antigen when present as a capsule (K) antigen], with no
problems reported for the vast majority of cases with
alternate phases, while S. enterica strains are normally
identified by the combination of O, H1 and H2 antigens. It
is still not clear how widespread in E. coli H antigens
determined by flagellin genes other than fliC are.

Typing is typically carried out using specific
antisera. The incidence of pathogenic E. coli in
association with human and animal disease supports the
need for suitable and rapid typing techniques.



DESCRIPTION OF THE INVENTION 

In a first aspect, the present invention provides a 
novel nucleic acid molecule encoding all or part of an
E. coli flagellin protein.

The present invention provides, for the first time,
full length sequence for a flagellin gene for the
following E. coli type strains: H6 (SEQ ID NO: 8), H9 (SEQ
ID NO: 11), H10 (SEQ ID NO: 12), H14 (SEQ ID NO: 15),
H18 (SEQ ID NO: 18), H23 (SEQ ID NO: 22), H51 (SEQ ID NO:
50), H45 (SEQ ID NO: 43), H49 (SEQ ID NO: 48), H19 (SEQ ID
NO: 19), H30 (SEQ ID NO: 29), H32 (SEQ ID NO: 31), H26 (SEQ
ID NO: 25), H41 (SEQ ID NO: 39), H15 (SEQ ID NO: 16),
H20 (SEQ ID NO: 20), H28 (SEQ ID NO: 27), H46 (SEQ ID NO:
44), H31 (SEQ ID NO: 30), H34 (SEQ ID NO: 33), H43 (SEQ ID
NO: 41) and H52 (SEQ ID NO: 51). Corrected full length
sequences have been obtained for H7 (SEQ ID NO: 9) and
H12 (SEQ ID NO: 14) type strains.

Partial flagellin gene sequence, including the
central variable region, has been obtained for the
following E. coli H type strains: H40 (SEQ ID NO: 38),
H8 (SEQ ID NO: 10), H21 (SEQ ID NO: 21), H47 (SEQ ID NO: 46),
H11 (SEQ ID NO: 13), H17 (SEQ ID NO: 17), H25 (SEQ ID NO:
24), H42 (SEQ ID NO: 40), H27 (SEQ ID NO: 26), H35 (SEQ ID
NO: 34), H2 (SEQ ID NO: 67), H3 (SEQ ID NO: 68), H24 (SEQ ID
NO: 23), H37 (SEQ ID NO: 35), H50 (SEQ ID NO: 49), H4 (SEQ ID
NO: 6), H44 (SEQ ID NO: 42), H38 (SEQ ID NO: 36), H39 (SEQ ID
NO: 37), H55 (SEQ ID NO: 53), H29 (SEQ ID NO: 28), H33 (SEQ
ID NO: 32), H5 (SEQ ID NO: 7), H54 (SEQ ID NO: 52) and
H56 (SEQ ID NO: 54).

Comparison of sequences demonstrates that unique
flagellin genes have now been sequenced (partially or
completely) for the following E. coli H type strains: H1,
H2, H3, H5, H6, H7, H9, H11, H12, H14, H15, H18, H19, H20,
H21, H23, H24, H25, H26, H27, H28, H29, H30, H31, H32,
H33, H34, H35, H37, H38, H39, H41, H42, H43, H45, H46,
H48, H49, H51, H52, H54, and H56 and either H8 or H40, H10
or H50, and H4 or H17.

By comparison of these sequences, the present 
inventors were able to identify specific sequences for 
each of the above H serotypes. 

The present invention also provides fliC sequences 
from 10 different H7 strains, in addition to that from the 
H7 type strain, and two sequences specific to H7 of O157
and O55 E. coli strains.

The present invention encompasses all or part of the
flagellin genes sequenced for H2, H3, H5, H6, H9, H11,
H14, H18, H19, H20, H21, H23, H24, H25, H26, H27, H28,
H29, H30, H31, H32, H33, H34, H35, H37, H38, H39, H41,
H42, H43, H44, H45, H46, H47, H48, H49, H51, H52, H54,
H55, H56, H8, H40, H15, H10 or H50, H4 and H17 type
strains. Of these flagellin genes sequenced, those from
the type strains for H8 and H40 are identical, those from
type strains H10 and H50, H1 and H12, H38 and H55, H21 and
H47, and H4, H17 and H44 type strains are highly similar.

The invention also encompasses newly provided
sequence for H7 and H12 as well as novel primers for the
specific amplification of H1, H7, H12 and H48 as well as
for the other above mentioned newly sequenced flagellin
genes.

By cloning and expression of these sequenced
flagellin genes in a fliC deletion E. coli K-12 strain,
and use of anti-H antiserum, we have confirmed the H
specificities encoded by 39 flagellin genes. The 39 H
specificities are H1, H2, H4, H5, H6, H7, H9, H10, H11,
H12, H14, H15, H16, H18, H19, H20, H21, H23, H24, H26,
H27, H28, H29, H30, H31, H32, H33, H34, H38, H39, H41,
H42, H43, H45, H46, H49, H51, H52, and H56, encoded by
flagellin genes obtained from H type strains for H1, H2,
H4, H5, H6, H7, H9, H10, H11, H12, H14, H15, H3, H18, H19,
H20, H21, H23, H24, H26, H27, H28, H29, H30, H31, H32,
H33, H34, H38, H39, H41, H42, H43, H45, H46, H49, H51,
H52, and H56 respectively.

The nucleic acid molecules of the invention may be
variable in length. In one embodiment they are
oligonucleotides of from about 10 to about 20 nucleotides
in length. The oligonucleotides of the invention are
specific for the flagellin gene from which they are
derived and are derived from the central region of the
gene. In one embodiment, oligonucleotides in accordance
with the present invention, which also include
oligonucleotides from the previously sequenced E. coli H1,
H7, H12 and H48 genes, are those shown in Table 3.

The 45 sequences (see Table 3) provide a panel to 
which newly sequenced genes can be compared to select 
specific oligonucleotides for those newly sequenced genes. 
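
The comparison of a newly sequenced gene against a panel can be
sketched in a few lines of Python. This is only an illustrative,
exact-match screen and not the inventors' procedure: the function
name, the toy sequences and the default oligonucleotide length are
assumptions, only the listed strand is compared, and a real
selection would also need to consider mismatches and hybridisation
conditions.

```python
def specific_oligos(target, panel, length=18):
    """Return (1-based start, oligo) pairs from `target` that do not
    occur, as exact matches on the listed strand, in any panel sequence."""
    target = target.upper()
    others = [seq.upper() for seq in panel]
    hits = []
    for i in range(len(target) - length + 1):
        oligo = target[i:i + length]
        # Keep the window only if no other gene contains it exactly.
        if not any(oligo in other for other in others):
            hits.append((i + 1, oligo))
    return hits


# Toy (hypothetical) sequences, far shorter than real flagellin genes.
new_gene = "AAAAACCCCC"
panel = ["AAAAAGGGGG"]
candidates = specific_oligos(new_gene, panel, length=5)
```

Windows shared with a panel member (here the leading AAAAA) are
dropped, leaving candidates that would then be checked
experimentally.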

In a second aspect the invention provides a method of
detecting the presence of E. coli of a particular H
serotype in a sample, the method comprising the step of
specifically hybridising at least one nucleic acid
molecule derived from a flagellin gene, wherein the at
least one nucleic acid molecule is specific for a
particular flagellin gene associated with the H serotype,
to any E. coli in the sample which contain the gene, and
detecting any specifically hybridised nucleic acid
molecules, wherein the presence of specifically hybridised
nucleic acid molecules identifies the presence of the H
serotype in the sample.

In one preferred embodiment the detection method is a 
Southern blot method. More preferably, the nucleic acid 
molecule is labelled and hybridisation of the nucleic acid 
molecule is detected by autoradiography or detection of
fluorescence.

Preferred nucleic acid molecules for the detection of 
particular flagellin genes are listed in Table 3. 

In a third aspect the invention provides a method of 
detecting the presence of E. coli of a particular H 
serotype in a sample, the method comprising the step of 
specifically hybridising at least one pair of nucleic acid 
molecules to any E. coli in the sample which contains the
flagellin gene for the particular H serotype, wherein at 
least one of the nucleic acid molecules is specific for 
the particular flagellin gene associated with the H 
serotype, and detecting any specifically hybridised 
nucleic acid molecules, wherein the presence of 
specifically hybridised nucleic acid molecules identifies 
the presence of the H serotype in the sample. 

In one preferred embodiment the detection method is a 
polymerase chain reaction method. More preferably, the 
nucleic acid molecules are labelled and hybridisation of 
the nucleic acid molecule is detected by electrophoresis. 

It is recognised that there may be instances where 
spurious hybridisation will arise through the initial 
selection of a sequence found in many different genes but 
this is typically recognisable by, for instance, 
comparison of band sizes against controls in PCR gels, and
an alternative sequence can be selected.

In a fourth aspect the invention provides a method
for detecting the presence of a particular O serotype and
H serotype of E. coli in a sample, the method comprising
the following steps:

(a) specifically hybridising at least one nucleic 
acid molecule, derived from and specific for a gene 
encoding a transferase or a gene encoding an enzyme for 
the transport or processing of a polysaccharide or 
oligosaccharide unit, the gene being involved in the 
synthesis of a particular E. coli O antigen, to any E. 
coli in the sample which contain the gene; 

(b) specifically hybridising at least one nucleic 
acid molecule derived from and specific for a particular 
flagellin gene associated with that H serotype, to any E. 
coli in the sample which contain the gene; and 

(c) detecting any specifically hybridised nucleic
acid molecules.

Preferred nucleic acid molecules for the detection of
particular flagellin genes are listed in Table 3.

In one preferred embodiment, the sequence of the 
nucleic acid molecule specific for the O antigen is 
specific to the nucleotide sequence encoding the O111
antigen. More preferably, the sequence is derived from a 
gene selected from the group consisting of wbdH 
(nucleotide position 739 to 1932 of Figure 5), wzx 
(nucleotide position 8646 to 9911 of Figure 5), wzy 
(nucleotide position 9901 to 10953 of Figure 5), wbdM 
(nucleotide position 11821 to 12945 of Figure 5) and 
fragments of those molecules of at least 10-12 nucleotides 
in length. Particularly preferred nucleic acid molecules 
are those set out in Tables 8 and 8A, with respect to the 
above mentioned genes. 

In another preferred embodiment, the sequence of the 
nucleic acid molecule specific for the O antigen is 
specific to the nucleotide sequence encoding the O157
antigen. More preferably, the sequence is derived from a
gene selected from the group consisting of wbdN
(nucleotide position 79 to 861 of Figure 6), wbdO
(nucleotide position 2011 to 2757 of Figure 6), wbdP 
(nucleotide position 5257 to 6471 of Figure 6), wbdR 
(nucleotide position 13156 to 13821 of Figure 6), wzx 
(nucleotide position 2744 to 4135 of Figure 6) and wzy 
(nucleotide position 858 to 2042 of Figure 6) and 
fragments of those molecules of at least 10-12 nucleotides 
in length. Particularly preferred nucleic acid molecules 
are those set out in Tables 9 and 9A, with respect to the 
above mentioned genes.

In one preferred embodiment the detection method is a 
Southern blot method. More preferably, the nucleic acid 
molecule is labelled and hybridisation of the nucleic acid 
molecule is detected by autoradiography or detection of
fluorescence.

In a fifth aspect the invention provides a method for 
detecting the presence of a particular O serotype and H 
serotype of E. coli in a sample, the method comprising the
following steps:

(a) specifically hybridising at least one pair of
nucleic acid molecules, at least one of which is derived
from and specific for a gene encoding a transferase or a
gene encoding an enzyme for the transport or processing of
a polysaccharide or oligosaccharide unit, the gene being
involved in the synthesis of the particular E. coli O
antigen, to any E. coli in the sample which contain the
gene;

(b) specifically hybridising at least one pair of 
nucleic acid molecules, at least one of which is derived 
from and specific for a particular flagellin gene 
associated with the particular H serotype, to any E. coli 
in the sample which contain the gene; and 

(c) detecting any specifically hybridised nucleic
acid molecules.

Preferred nucleic acid molecules for the detection of
particular flagellin genes are listed in Table 3.

In one preferred embodiment, the sequence of the
nucleic acid molecule specific for the O antigen is
specific to the nucleotide sequence encoding the O111
antigen. More preferably, the sequence is derived from a 
gene selected from the group consisting of wbdH 
(nucleotide position 739 to 1932 of Figure 5), wzx 
(nucleotide position 8646 to 9911 of Figure 5), wzy 
(nucleotide position 9901 to 10953 of Figure 5), wbdM 
(nucleotide position 11821 to 12945 of Figure 5) and 
fragments of those molecules of at least 10-12 nucleotides 
in length. Particularly preferred nucleic acid molecules 
are those set out in Tables 8 and 8A, with respect to the
above mentioned genes. 

In another preferred embodiment, the sequence of the 
nucleic acid molecule specific for the O antigen is 
specific to the nucleotide sequence encoding the O157
antigen. More preferably, the sequence is derived from a
gene selected from the group consisting of wbdN (nucleotide
position 79 to 861 of Figure 6), wbdO (nucleotide position
2011 to 2757 of Figure 6), wbdP (nucleotide position 5257
to 6471 of Figure 6), wbdR (nucleotide position 13156 to
13821 of Figure 6), wzx (nucleotide position 2744 to 4135
of Figure 6) and wzy (nucleotide position 858 to 2042 of
Figure 6) and fragments of those molecules of at least 10-12
nucleotides in length. Particularly preferred nucleic
acid molecules are those set out in Tables 9 and 9A, with
respect to the above mentioned genes.

In one preferred embodiment the detection method is a 
polymerase chain reaction method. More preferably, the 
nucleic acid molecules are labelled and hybridisation of 
the nucleic acid molecule is detected by electrophoresis. 

The present inventors believe that based on the
teachings of the present invention and available
information concerning O antigen gene clusters, and
through use of experimental analysis, comparison of
nucleic acid sequences or predicted protein structures,
nucleic acid molecules in accordance with the invention
can be readily derived for any particular O antigen of
interest. Suitable bacterial strains can typically be
acquired commercially from depositary institutions.

There are currently 166 defined E. coli O antigens.
Samples of the 166 different E. coli O antigen
serotypes are available from Statens Serum Institut,
Copenhagen, Denmark.

The inventors envisage rare circumstances whereby two
genetically similar gene clusters encoding serologically
different O antigens have arisen through recombination of
genes or mutation so as to generate polymorphic variants.
In these circumstances multiple pairs of oligonucleotides
may be selected to provide hybridisation to the specific
combination of genes. The invention thus envisages the
use of a panel containing multiple nucleic acid molecules
for use in the method of testing for O antigen in
conjunction with H antigen, wherein the nucleic acid
molecules are derived from genes encoding transferases
and/or enzymes for the transport or processing of a
polysaccharide or oligosaccharide unit including wzx or
wzy genes, wherein the panel of nucleic acid molecules is
specific to a particular O antigen. The panel of nucleic
acid molecules can include nucleic acid molecules derived
from O antigen sugar pathway genes where necessary.

The inventors also found two mutated flagellin genes
from H type strains for H35 and H54 which have insertion
sequences inserted into normal flagellar genes identical
or near identical to those of the H11 and H21 type
strains respectively. Thus, primers for H11 and H21
(listed in Table 3) would also amplify fragments in H35
and H54, which differ in size from those in H11 and H21
respectively. The inventors also provide two pairs of
primers each for H35 and H54 based on the insertion
sequence (see H35 and H54 columns in Table 3). The use of
one of them in combination with one of the H11 or H21
primers will generate a PCR band only in H35 or H54
respectively, and this will also differentiate H35 and H54
from H11 and H21 respectively.

The present invention also relates to methods of
detecting the presence of particular E. coli H antigens or
H antigen and O antigen combinations where one or more
nucleic acid molecules which generate a particular size
fragment indicative of the presence of that H antigen are
used or in which the combination of one antigen specific
primer for that H antigen with another primer for a
related H antigen provides for the detection of the
particular H antigen by hybridisation to the relevant
gene. Preferably, the H antigen is H11, H21, H35 or H54.
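
How fragment size can separate two closely related genes may be
illustrated with a short Python sketch. Every number below
(primer positions, insertion length, gel tolerance) is
hypothetical and chosen only for illustration; none is taken from
Table 3 or from the sequences of this specification.

```python
def amplicon_length(forward_start, reverse_end, insertion_length=0):
    """Expected PCR product length in base pairs.

    forward_start    1-based position of the first base of the forward primer
    reverse_end      1-based position of the last base covered by the reverse primer
    insertion_length length of any insertion sequence lying between the primers
    """
    if reverse_end < forward_start:
        raise ValueError("reverse primer must lie downstream of the forward primer")
    return reverse_end - forward_start + 1 + insertion_length


def call_by_band_size(observed, expected, tolerance=20):
    """Return the labels whose expected band size lies within `tolerance` bp."""
    return [label for label, size in expected.items()
            if abs(observed - size) <= tolerance]


# Hypothetical example: the same primer pair on an intact gene and on
# a gene carrying an assumed 1200 bp insertion sequence.
expected = {
    "H11": amplicon_length(500, 1000),
    "H35": amplicon_length(500, 1000, insertion_length=1200),
}
```

With these assumed values the two products differ by the length of
the insertion, so a band near either expected size points to one
gene and not the other.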

The pairs of nucleic acid molecules where the method 
of the fifth aspect is used may both hybridise to the 
relevant H or O antigen gene or alternatively only one may 
hybridise to the relevant gene and the other to another 
site. 

The inventors recognise in applying the methods of 
the invention for detecting combinations of O and H 
antigens to samples, that the methods do not indicate 
whether a positive result for a particular O and H antigen 
combination arises because the O and H antigen are 
present on a single E. coli strain present in the sample 
or are present on different E. coli strains present in the 
sample. Because the ability to identify the presence of 
E. coli strains with particular O and H antigen 
combinations is highly desirable (due to the relationship 
between particular combinations and pathogenicity) the 
determination that a particular combination is present in 
a sample can be followed by isolation of single colonies 
and checking whether they contain the relevant
combination by using the same method again or using 
antibody labelled magnetic beads to separate cells 
expressing the particular O or H antigen and then testing 
the isolated cells for the other serotype. 

In addition, as mentioned above, the present
inventors have established the existence of H7 primers
specific to the O157 and O55 serotypes. Using such
primers it is possible to detect particular O and H
antigen combinations with the use of H specific nucleic
acid molecules.

In a sixth aspect the invention provides a method for
detecting the presence of a particular O serotype and H
serotype of E. coli in a sample, the method comprising the
following steps:

(a) specifically hybridising at least one nucleic
acid molecule, derived from and specific for a gene
encoding a flagellin associated with a particular E. coli
H antigen serotype to any E. coli carrying the gene and
present in the sample; and

(b) detecting the at least one specifically
hybridised nucleic acid molecule, wherein the at least one
nucleic acid molecule is specific for the particular
combination of O and H antigen.

Preferably the combination is O55:H7 or O157:H7.

The ability to detect the O157:H7 combination from a
particular H7 primer or pair is of particular use given
the association of this combination with pathogenic
strains.

In a seventh aspect the present invention provides a 
method for testing a food derived sample for the presence 
of one or more particular E. coli O antigens and H 
antigens comprising testing the sample by a method of the 
fourth, fifth or sixth aspect of the invention.

In an eighth aspect the present invention provides a 
method for testing a faecal derived sample for the 
presence of one or more particular E. coli O antigens and 
H antigens comprising testing the sample by a method of 
the fourth, fifth or sixth aspect of the invention.

In a ninth aspect the present invention provides a 
method for testing a patient or animal derived sample for 
the presence of one or more particular E. coli O antigens 
and H antigens comprising testing the sample by a method 
of the fourth, fifth or sixth aspect of the invention.

Preferably, the method of the seventh, eighth or
ninth aspect of the invention is a polymerase chain
reaction method. More preferably the oligonucleotide
molecules for use in the method are labelled. Even more
preferably the hybridised nucleic acid molecules are
detected by electrophoresis.

In the above described methods it will be understood 
that where pairs of nucleic acid molecules are used one of 
the nucleic acid molecules may hybridise to a sequence 
that is not from the O antigen transferase, wzx or wzy
gene or the flagellin gene. Further where both hybridise 
to these genes the O antigen molecules may hybridise to 
the same or a different one of these genes. 

In a tenth aspect the present invention provides a kit
for identifying the H serotype of E. coli, the kit
comprising:

at least one nucleic acid molecule derived from and
specific for an E. coli flagellin gene.

In an eleventh aspect the present invention provides 
a kit for identifying the H and O serotype of E. coli, the 
kit comprising:

(a) at least one nucleic acid molecule derived from and 
specific for an E. coli flagellin gene; and 

(b) at least one nucleic acid molecule derived from 
and specific for a gene encoding a transferase or a gene 
encoding an enzyme for the transport or processing of a 
polysaccharide or oligosaccharide unit, the gene being 
involved in the synthesis of a particular E. coli O 
antigen. 

The nucleic acid molecules may be provided in the 
same or different vials. The kit may also provide in the 
same or separate vials a second set of specific nucleic 
acid molecules.

Particularly preferred nucleic acid molecules for 
inclusion in the kits are those specified in Tables 3, 8, 
8A, 9 and 9A as described above. 



PCT/AU99/00385 
WO 99/61458 

DEFINITIONS 

In this specification, we have used the term "flagellin gene"
in many cases where previously one would have used "fliC",
to allow for the uncertainty as to locus introduced by
recent observations. However, uncertainty as to the locus
does not alter the fact that most E. coli strains express
a single H antigen and that a single flagellin gene
sequence per strain is required to give the genetic basis
for H antigen variation. Any use of the name fliC in this
specification where a different locus is later shown to be 
involved would not affect the validity of conclusions 
drawn regarding application of information based on the 
sequence, where the conclusions do not relate to the map 
position. Thus it is generally the nucleic acid molecule 
itself which is of importance rather than the name 
attributed to the gene. When it is known or suspected 
that the gene encoding the H antigen is not in the fliC 
locus, we use the term flagellin rather than fliC. 

The phrase, "a nucleic acid molecule derived from a 
gene" means that the nucleic acid molecule has a 
nucleotide sequence which is either identical or 
substantially similar to all or part of the identified 
gene. Thus a nucleic acid molecule derived from a gene 
can be a molecule which is isolated from the identified 
gene by physical separation from that gene, or a molecule 
which is artificially synthesised and has a nucleotide 
sequence which is either identical to or substantially 
similar to all or part of the identified gene. While some 
workers consider only the DNA strand with the same 
sequence as the mRNA transcribed from the gene, here 
either strand is intended. 
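
Because either strand is intended, a check of whether an
oligonucleotide corresponds to part of a gene has to consider the
reverse complement as well as the sequence as written. A minimal
Python sketch follows, using exact matching only; the function
names and the toy sequence are assumptions for illustration, and
the "substantially similar" case of the definition is not
modelled.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def strand_of(oligo, gene):
    """Report which strand of `gene` contains `oligo` as an exact match.

    Returns "given strand", "complementary strand" or None.
    """
    oligo, gene = oligo.upper(), gene.upper()
    if oligo in gene:
        return "given strand"
    if reverse_complement(oligo) in gene:
        return "complementary strand"
    return None


# Toy (hypothetical) gene fragment.
gene = "ATGGCACAAGTC"
```

Under this sketch both GCACAA and its reverse complement TTGTGC
count as matching the toy fragment, one on each strand.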

Transferase genes are regions of nucleic acid which 
have a nucleotide sequence which encodes gene products 
that transfer monomeric sugar units.

Flippase or wzx genes are regions of nucleic acid
which have a nucleotide sequence which encodes a gene
product that flips oligosaccharide repeat units generally
composed of three to six monomeric sugar units to the
external surface of the membrane.

Polymerase or wzy genes are regions of nucleic acid 
which have a nucleotide sequence which encodes gene 
products that polymerise repeating oligosaccharide units 
generally composed of 3-6 monomeric sugar units. 

The nucleotide sequences provided in this
specification are described as anti-sense sequences. This
term is used in the same manner as it is used in Glossary
of Biochemistry and Molecular Biology Revised Edition,
David M. Glick, 1997 Portland Press Ltd., London on page
11 where the term is described as referring to one of the
two strands of double-stranded DNA usually that which has
the same sequence as the mRNA. We use it to describe this
strand which has the same sequence as the mRNA.

NOMENCLATURE

Synonyms for E. coli O111 rfb

Current names    Our names    Bastin et al. 1991

wbdH             orf1
wbdI             orf3         [illegible]
manC             orf4         rfbM*
manB             orf5         rfbK*
wbdJ             orf6         orf6.7*
wzx              orf8         orf8.9 and rfbX*
wzy              orf9
wbdL             orf10

* Nomenclature according to Bastin D.A., et al. 1991 "Molecular
cloning and expression in Escherichia coli K-12 of the rfb gene
cluster determining the O antigen of an E. coli O111 strain". Mol.
Microbiol. 5:9 2223-2231.

Other Synonyms

wzy     rfc
wzx     rfbX
rmlA    rfbA
rmlB    rfbB
rmlC    rfbC
rmlD    rfbD
glf     orf6*
wbbI    orf3#, orf8* of E. coli K-12
wbbJ    orf2#, orf9* of E. coli K-12
wbbK    orf1#, orf10* of E. coli K-12
wbbL    orf5#, orf11* of E. coli K-12

# Nomenclature according to Yao, Z. and M. A. Valvano 1994.
"Genetic analysis of the O-specific lipopolysaccharide biosynthesis
region (rfb) of Escherichia coli K-12 W3110: identification of genes
that confer group-specificity to Shigella flexneri serotypes Y and
4a". J. Bacteriol. 176: 4133-4143.

* Nomenclature according to Stevenson et al. 1994. "Structure of
the O-antigen of E. coli K-12 and the sequence of its rfb gene
cluster". J. Bacteriol. 176: 4144-4156.

• The O antigen genes of many species were given rfb names (rfbA
etc) and the O antigen gene cluster was often referred to as the
rfb cluster. There are now new names for the rfb genes as shown
in the table. Both terminologies have been used herein, depending
on the source of the information.

In the claims that follow and in the summary of the
invention, except where the context requires otherwise due
to express language or necessary implication, the word
"comprising" is used in the sense of "including", i.e. the
features specified may be associated with further features
in various embodiments of the invention.

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 shows EcoRI restriction maps of cosmid
clones pPR1054, pPR1055, pPR1056, pPR1058, pPR1287 which
are subclones of E. coli O111 O antigen gene cluster. The
thickened line is the region common to all clones.
Broken lines show segments that are non-contiguous on the
chromosome. The deduced restriction map for E. coli
strain M92 is shown above.

Figure 2 shows a restriction mapping analysis of E.
coli O111 O antigen gene cluster within the cosmid clone
pPR1058. Restriction enzymes are: (B: BamHI; Bg: BglII;
E: EcoRI; H: HindIII; K: KpnI; P: PstI; S: SalI and X:
XhoI). Plasmids pPR1230, pPR1231, and pPR1288 are
deletion derivatives of pPR1058. Plasmids pPR1237,
pPR1238, pPR1239 and pPR1240 are in pUC19. Plasmids
pPR1243, pPR1244, pPR1245, pPR1246 and pPR1248 are in
pUC18, and pPR1292 is in pUC19. Plasmid pPR1270 is in
[...]. Probes 1, 2 and 3 were isolated as internal
fragments of pPR1246, pPR1243 and pPR1237 respectively.
Dotted lines indicate that subclone DNA extends to the
left of the map into attached vector.

Figure 3 shows the structure of E. coli O111 O
antigen gene cluster.

Figure 4 shows the structure of E. coli O157 O
antigen gene cluster.

Figure 5 shows the nucleotide sequence (SEQ ID NO:
45) of the E. coli O111 O antigen gene cluster. Note: (1)
The first and last three bases of a gene are underlined
and in italics respectively; (2) The region which was
previously sequenced by Bastin and Reeves 1995 "Sequence
and analysis of the O antigen gene (rfb) cluster of
Escherichia coli O111" Gene 164: 17-23 is marked.

Figure 6 shows the nucleotide sequence (SEQ ID NO:
56) of the E. coli O157 O antigen gene cluster. Note: (1)
The first and last three bases of a gene (region) are
underlined and in italics respectively; (2) The region
previously sequenced by Bilge et al. 1996 "Role of the
Escherichia coli O157:H7 O side chain in adherence and
analysis of an rfb locus". Inf. and Immun. 64:4795-4801 is
marked.

Figures 7 to 9 show the nucleotide sequences (SEQ ID
NOS: 66 to 68 respectively) obtained for flagellin genes
from E. coli type strains for H1 to H3 respectively. The
primer positions listed in Table 3 are based on treating
the first nucleotide of each of these sequences as No. 1.

Figures 10 to 18 show the nucleotide sequences (SEQ
ID NOS: 6 to 14 respectively) obtained for flagellin genes
from E. coli type strains for H4 to H12 respectively. The
primer positions listed in Table 3 are based on treating
the first nucleotide of each of these sequences as No. 1.

Figures 19 and 20 show the nucleotide sequences (SEQ
ID NOS: 15 to 16 respectively) obtained for flagellin
genes from E. coli type strains for H14 and H15
respectively. The primer positions listed in Table 3 are
based on treating the first nucleotide of each of these
sequences as No. 1.
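
Since the primer positions count the first nucleotide of each
sequence as No. 1, they are 1-based and need adjusting when a
sequence is handled in a 0-based language. A minimal Python
sketch of the conversion follows; the helper name, the toy
sequence and the positions are hypothetical, and treating the
end position as inclusive is an assumption.

```python
def subsequence(seq, start, end):
    """Return the bases from position `start` to `end` of `seq`,
    counting the first nucleotide as No. 1 and including both ends."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("positions must satisfy 1 <= start <= end <= len(seq)")
    # Python slices are 0-based and exclude the end index.
    return seq[start - 1:end]


# Toy (hypothetical) sequence; real figures hold full gene sequences.
toy = "ATGGCACAAGTC"
oligo = subsequence(toy, 4, 9)
```

The same helper would apply to the gene coordinates quoted for
Figures 5 and 6 if those are counted in the same way.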

Figures 22 to 26 show the nucleotide sequences (SEQ
ID NOS: 17 to 21 respectively) obtained for flagellin
genes from E. coli type strains for H17 to H21
respectively. The primer positions listed in Table 3 are
based on treating the first nucleotide of each of these
sequences as No. 1.

Figures 27 to 39 show the nucleotide sequences (SEQ 
ID NOS: 22 to 34) obtained for flagellin genes from E. 
coli type strains for H23 to H35 respectively. The primer 
positions listed in Table 3 are based on treating the 
first nucleotide of each of these sequences as No. 1. 

Figures 40 to 49 show the nucleotide sequences (SEQ 
ID NOS: 35 to 44) obtained for flagellin genes from E. 
coli type strains for H37 to H46 respectively. The primer 
positions listed in Table 3 are based on treating the 
first nucleotide of each of these sequences as No. 1. 

Figures 50 to 55 show the nucleotide sequences (SEQ 
ID NOS: 46 to 51) obtained for flagellin genes from E. 
coli type strains for H47 to H52 respectively. The primer 
positions listed in Table 3 are based on treating the 
first nucleotide of each of these sequences as No. 1. 

Figures 56 to 58 show the nucleotide sequences (SEQ 
ID NOS: 52 to 54) obtained for flagellin genes from E. 
coli type strains for H54 to H56 respectively. The primer 
positions listed in Table 3 are based on treating the 
first nucleotide of each of these sequences as No. 1.

Figure 59 shows the nucleotide sequence (SEQ ID NO:
55) obtained for the flagellin gene from E. coli H7 strain
M1179. The primer positions listed in Table 3 are based
on treating the first nucleotide of this sequence as
No. 1.

Figures 60 to 68 show the nucleotide sequences (SEQ
ID NOS: 57 to 65 respectively) obtained for flagellin
genes from E. coli strains M1004, M1211, M1200, M1686,
M1328, M917, M527, M973, and M918 respectively. The primer
positions listed in Table 3 are based on treating the
first nucleotide of each of these sequences as No. 1.

Figure 69 shows the nucleotide sequence (SEQ ID NO:
1) of the fliC gene and DNA flanking the fliC gene from
the H25 type strain.

Figure 70A shows the nucleotide sequence (SEQ ID NO:
2) obtained from the 5' end of the insert of plasmid
pPR1989. The insert of plasmid pPR1989 encodes the second
flagellin gene of the H55 type strain.

Figure 70B shows the nucleotide sequence (SEQ ID NO:
3) obtained from the 3' end of the insert of plasmid
pPR1989. The insert of plasmid pPR1989 encodes the second
flagellin gene of the H55 type strain.

Figure 71 shows the nucleotide sequence (SEQ ID NO: 4)
obtained from the 5' end of the insert of plasmid pPR1993.
The insert of plasmid pPR1993 encodes the second
flagellin gene of the H36 type strain.

Figure 72 shows the nucleotide sequence (SEQ ID NO: 5)
obtained from the 3' end of the insert of plasmid pPR1993.
The insert of plasmid pPR1993 encodes the second
flagellin gene of the H36 type strain.

Figure 73A shows the sequence of the polylinker and the
SD sequence of plasmid pTrc99A.

Figure 73B shows the sequence of the junction region 
between the SD sequence and the start of flagellin gene in 
the plasmids used for the expression of flagellin genes. 

BEST METHOD OF CARRYING OUT THE INVENTION

In carrying out the methods of the invention with
respect to the testing of particular sample types,
including samples from food, patients, animals and faeces,
the samples are prepared by techniques routinely
used in the preparation of such samples for DNA based
testing. The steps for testing the samples using
particular nucleic acid molecules in assay formats such as
Southern blots and PCR are performed under routinely
determined conditions appropriate to the sample and the
nucleic acid molecules.

H antigen

Materials and Methods

1. Bacterial strains and plasmids:

There are 54 H types in E. coli [Ewing, W.H.: Edwards
and Ewing's identification of the Enterobacteriaceae,
Elsevier Science Publishers, Amsterdam, The Netherlands,
1986]: note that H antigens from 1 to 57 were listed and
that 13, 22 and 57 are not valid. All the standard H type
strains except H16 were obtained from the Institute of
Medical and Veterinary Science, Adelaide, Australia. The
primary stocks are held at the Statens Serum Institut,
Copenhagen, Denmark.
The additional H7 strains used are listed in Table 1.

We do not have the type strain for H16. It is known
that the H3 type strain is biphasic and can also express
the H16 flagellin gene [Ratiner, Y. A. (1985) "Two genetic
arrangements determining flagellar antigen specificities
in two diphasic E. coli strains" FEMS Microbiol Lett 29:
317-323]. We have sequenced and cloned the H16 flagellin
gene from the H3 type strain (see below).

E. coli K-12 strain C600 hsm hsr fliC::Tn10
[Kuwajima, G. (1988) "Flagellin domain that affects H
antigenicity of E. coli K-12" J. Bacteriol. 170: 485-488]
(laboratory stock no. M2126) was obtained from Dr Benita
Westerlund-Wikstrom of the Department of Biosciences,
University of Helsinki, Finland. E. coli K-12 strain
EJ2282 (laboratory no. P5560) is a fliC deletion strain,
and was obtained from Dr Masatoshi Enomoto of the
Department of Biology, Okayama University, Japan
[Tominaga, A., M. A.-H. Mahmoud, T. Mukaihara and M.
Enomoto (1994) "Molecular characterization of intact, but
cryptic, flagellin genes in the genus Shigella" Mol.
Microbiol. 12: 277-285].

Plasmid pTrc99A was purchased from Pharmacia LKB
(Melbourne, VIC, Australia).

2. Antisera

Antisera against H1, H3, H8, H14, H15, H17, H23, H24,
H25, H26, H29, H30, H31, H32, H33, H35, H36, H37, H38,
H39, H43, H44, H46, H47, H48, H49, H52, H53, H54, H55, and
H56 were obtained from the Institute of Medical and
Veterinary Science, Adelaide, Australia. Antisera against
H2, H4, H5, H6, H7, H9, H10, H11, H12, H16, H18, H19, H20,
H21, H27, H28, H34, H40, H41, H42, H45, and H51 were
obtained from Denka Seiken Co., Ltd, Tokyo, Japan.

Antiserum to type H50 was not available from any known
source.

The antisera available were checked against the
appropriate type strains to confirm the specificities of
both flagellin H antigen and H antisera: 52 sera (all
those listed above except anti-H16 serum) gave a positive
reaction with the corresponding type strains for that
serum.
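The counts in this section can be cross-checked with a short script. The sketch below is illustrative only and is not part of the described method; it assumes the two supplier lists read as transcribed in the code, and the variable names are hypothetical.

```python
# Valid H types per the text: H1 to H57 were listed; 13, 22 and 57 are not valid.
valid_types = set(range(1, 58)) - {13, 22, 57}

# Antisera by supplier, as transcribed from the text (assumed transcription).
imvs = {1, 3, 8, 14, 15, 17, 23, 24, 25, 26, 29, 30, 31, 32, 33, 35, 36,
        37, 38, 39, 43, 44, 46, 47, 48, 49, 52, 53, 54, 55, 56}
denka = {2, 4, 5, 6, 7, 9, 10, 11, 12, 16, 18, 19, 20, 21, 27, 28, 34,
         40, 41, 42, 45, 51}

available = imvs | denka
# H types with no antiserum from either supplier.
missing = sorted(valid_types - available)
# No H16 type strain was held, so anti-H16 could not be checked against one.
checked = available - {16}

print(f"valid H types: {len(valid_types)}")
print(f"antisera available: {len(available)}")
print(f"types without antiserum: {missing}")
print(f"sera checked against a type strain: {len(checked)}")
```

Under these assumptions the script reproduces the figures stated in the text: 54 valid types, no antiserum for H50, and 52 sera checked against their type strains.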

3. Agglutination test:

Bacteria from 1 ml of an overnight culture grown in
Luria broth (Difco Tryptone, 10 g/l; Difco yeast extract,
5 g/l; NaCl, 0.5 g/l; pH 7.2) at 30°C were centrifuged
(4000 rpm/10 min) and the bacterial pellet resuspended in
100 µl of saline. The agglutination test was carried out
by mixing equal volumes (5 µl) of both the cells and
antiserum on a slide. The slide was rocked for 1 minute
and then observed for agglutination. For all agglutination
tests, saline containing no antiserum was mixed with cells
to be used as a negative control.

For testing the H specificities of strain M2126 or
strain P5560 carrying plasmid containing cloned flagellin
genes, cells of M2126 or P5560 were used as an additional
negative control.

All agglutination tests were first carried out using
undiluted antisera (note that the antisera we used have
been diluted before reaching our hands), except for
anti-H11, anti-H34, anti-H52 and anti-H25 serum for which we
used 1:10 dilutions to avoid background agglutination. In
cases for which cross-reactions have been reported, we
carried out agglutination tests using serial dilutions of
sera (see section 10.1).

4. Motility test:

The motility of strain M2126 or strain P5560 carrying
cloned flagellin genes was examined microscopically. 1 ml
of overnight culture grown in Luria broth (Difco Tryptone,
10 g/l; Difco yeast extract, 5 g/l; NaCl, 0.5 g/l; pH 7.2)
at 30°C was inoculated into 10 ml of Luria broth, and the
culture was shaken at 100 rpm at 30°C to early log phase
(OD 625 = 0.2). A loopful of culture was placed on a slide
and examined under a microscope. Motility of individual
cells was easily distinguished from Brownian movement and
streaming, and presence or absence of motility recorded.

5. Isolation of chromosomal DNA:

Chromosomal DNA from all the 53 H type strains and
the strains listed in Table 1 was isolated using the
Promega Genomic isolation kit (Madison WI USA). Each
chromosomal DNA sample was checked by gel electrophoresis
of the DNA and by PCR amplification of the mdh gene using
oligonucleotides based on the E. coli K-12 mdh gene [Boyd,
E.F., Nelson, K., Wang, F.-S., Whittam, T.S. and Selander,
R.K.: Molecular genetic basis of allelic polymorphism in
malate dehydrogenase (mdh) in natural populations of
Escherichia coli and Salmonella enterica. Proc. Natl.
Acad. Sci. USA 91 (1994) 1280-1284].

6. PCR amplification of flagellin gene:

Flagellin genes from different strains were first
PCR amplified using one of the following four pairs of
oligonucleotides:

#1285 (5'-atggcacaagtcattaatac) and
#1286 (5'-ttaaccctgcagtagagaca);

#1417 (5'-ctgatcactcaaaataatatcaac) and
#1418 (5'-ctgcggtacctggttggc);

#1431 (5'-atggcacaagtcattaatacccaac) and
#1432 (5'-ctaaccctgcagcagagaca);

#1575 (5'-gggtggaaacccaatacg) and
#1576 (5'-gcgcatcaggcaatttgg).

PCR reactions were carried out under the following
conditions: denaturing, 94°C/30'; annealing, temperature
varies (refer to Table 2)/30'; extension, 72°C/1'; 30
cycles. The PCR product was purified using the Promega
Wizard PCR purification kit (Madison WI USA) before being
sequenced.
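As an aid to reading the primer list, the following sketch tabulates the length, GC content and a rough melting temperature for each of the four primer pairs. It is only an illustration: the Wallace rule used here, Tm = 2(A+T) + 4(G+C), is a generic rule of thumb for short oligonucleotides and is an assumption of this example, not the basis on which the annealing temperatures in Table 2 were chosen. The function names are hypothetical.

```python
# Primer sequences copied from the text, keyed by (forward, reverse) primer number.
PRIMER_PAIRS = {
    ("#1285", "#1286"): ("atggcacaagtcattaatac", "ttaaccctgcagtagagaca"),
    ("#1417", "#1418"): ("ctgatcactcaaaataatatcaac", "ctgcggtacctggttggc"),
    ("#1431", "#1432"): ("atggcacaagtcattaatacccaac", "ctaaccctgcagcagagaca"),
    ("#1575", "#1576"): ("gggtggaaacccaatacg", "gcgcatcaggcaatttgg"),
}


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the oligonucleotide."""
    seq = seq.lower()
    return (seq.count("g") + seq.count("c")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Rule-of-thumb melting temperature in degrees C: 2(A+T) + 4(G+C)."""
    seq = seq.lower()
    at = seq.count("a") + seq.count("t")
    gc = seq.count("g") + seq.count("c")
    return 2 * at + 4 * gc


for (fwd_id, rev_id), (fwd, rev) in PRIMER_PAIRS.items():
    for name, seq in ((fwd_id, fwd), (rev_id, rev)):
        print(f"{name}: {len(seq)} nt, GC {gc_fraction(seq):.0%}, "
              f"Tm approx. {wallace_tm(seq)} C")
```

The estimate ignores salt and primer concentration, so it is useful only for comparing primers with one another.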

The H36 and H53 type strains gave two PCR bands using
primer pairs #1431/#1432 and #1417/#1418 respectively, and
were not sequenced.

7. Enzymes and buffers:

Restriction endonucleases and T4 DNA ligase were
purchased from Boehringer Mannheim (Castle Hill, NSW,
Australia). Restriction enzymes were used in the
recommended commercial buffer.

8. Sequencing of the flagellin genes:

Each PCR product was first sequenced using the
oligonucleotide primers used for the PCR amplification.
Primers based on the obtained sequence were then used to
sequence further, and this procedure was repeated until
the entire PCR product was sequenced.

The sequencing reactions were performed using the
DyeDeoxy Terminator Cycle Sequencing method (Applied
Biosystems, CA, USA), and reaction products were analysed
using fluorescent dye and an ABI377 automated sequencer
(CA, USA).

Sequence data were processed and analysed using
Staden programs [Sacchi C T, Zanella R C, Caugant D A,
Frasch C E, Hidalgo N T, Milagres L G, Pessoa L L, Ramos S
R, Camargo M C C and Melles C E A "Emergence of a new
clone of serogroup C Neisseria meningitidis in Sao Paulo,
Brazil" J. Clin. Microbiol. 30 (1992) 1282-1286;
Staden, R.: Automation of the computer handling of gel
reading data produced by the shotgun method of DNA
sequencing. Nucl. Acids Res. 10 (1982a) 4731-4751;
Staden, R.: An interactive graphics program for comparing
and aligning nucleic acid and amino acid sequences. Nucl.
Acids Res. 10 (1982b) 2951-2961;
Staden, R.: Computer methods to locate signals in nucleic
acid sequences. Nucl. Acids Res. 12 (1984a) 505-519;
Staden, R.: Graphic methods to determine the function of
nucleic acid sequences. A summary of ANALYSEQ options.
Nucl. Acids Res. 12 (1984b) 521-538;
Staden, R.: The current status and portability of our
sequence handling software. Nucl. Acids Res. 14 (1986)
217-231].

We were able to PCR amplify flagellin genes from H
type strains for H7, 23, 12, 51, 45, 49, 19, 9, 30, 32,
26, 41, 15, 20, 28, 46, 31, 14, 18, 6, 34, 48, 43, 10, 52,
and also from H7 strains M1004, M527, M1686, M1211, M1328,
M973, M1179, M1200, M917, and M918 using primers #1575 and
#1576, which are based on sequences 51-34 bp upstream and
37-54 bp downstream of the start and end of the E. coli K-12
fliC gene respectively. Thus, the full sequence of the
flagellin gene from these strains was obtained, and the use
of flanking sequence for primers makes it highly likely
that they are at the fliC locus.

For other strains, we were only able to amplify the
flagellin gene using one or more of the other three pairs
of primers, which are based on sequence within the fliC
gene, and thus only partial sequence was obtained. These
amplicons may be of the fliC gene or one of the
alternative flagellin genes. The flagellin gene sequences
obtained from H type strains for H40, 8, 21, 47, 11, 27,
35, 2, 3, 24, 37, 50, 4, 44, 38, 55, 29, 33, 5, and 56
lack 18 and 14 codons at the 5' and 3' ends respectively.
The flagellin gene sequence of H39 obtained using primers
#1285/#1286 lacks 18 and 19 codons at the 5' and 3' ends
respectively. The flagellin gene sequences of H type
strains of H17, 25 and 42 lack 23 and 21 codons at the 5'
and 3' ends respectively. The flagellin gene sequence of the
H type strain for H54 lacks 23 and 12 codons at the 5' and
3' ends respectively. There is very little variation in
the sequence at the two ends of flagellin genes, and
antigenic variation is due to variation in the central
region of the gene. The absence of sequence for the ends
of some of the flagellin genes is not important for the
purpose of the present invention relating to the detection
of antigenic variation by DNA sequence based means.

The fliC genes from H type strains of H1, H7 and H12
have been sequenced previously [Schoenhals, G. and
Whitfield, C.: Comparative analysis of flagellin sequences
from Escherichia coli strains possessing serologically
distinct flagellar filaments with a shared complex surface
pattern. J. Bacteriol. 175 (1993) 5395-5402] and we did
not sequence the gene from the H1 strain.

We have sequenced fliC genes from a set of H7 strains 
with different O antigens, including that of fliC from the 
H7 type strain as one of the set: we have found four 
differences from the published H7 sequence (GenBank 
accession number L07388) which we believe are due to 
errors in the published sequence. 

We have also re-sequenced the fliC gene from the H12
type strain, and have found one difference from the
published H12 sequence (GenBank accession number L07389)
which we believe is due to an error in the published
sequence.

The flagellin genes from type strains H35 and H54
were also amplified using primers #1431/#1432, which are
based on sequence within the fliC gene. Sequence data
revealed that these two genes would be non-functional due
to an insertion sequence inserted in the middle of each. We
have sequenced them to facilitate selection of primers for
the functional flagellin genes.

9. Cloning of flagellin genes

DNA was digested for 2 hr at 37°C with the appropriate
restriction enzyme(s). The reaction product was then
extracted once with phenol, and twice with ether. DNA was
precipitated with 2 vols of ethanol and resuspended in
water before the ligation reaction was carried out.
Ligation was carried out overnight at 4°C and the ligated
DNA was electroporated into one of the E. coli fliC mutant
strains.

9.1. Cloning of flagellin genes from type strains for H1,
H2, H3, H5, H6, H7, H9, H10, H11, H12, H14, H15, H18, H19,
H20, H21, H24, H26, H27, H28, H29, H31, H34, H38, H39,
H41, H42, H43, H45, H46, H49, H51, H52, and H56:

The full flagellin gene was PCR amplified using
primers #1868 and #1870 (Table 3A). Both these primers
are based on the sequences of the H7 flagellin gene of the
H7 type strain. #1868 is the 5' primer: there is an NcoI
site incorporated into the primer (Table 3B) and the
flagellin gene starts at base 3 of the NcoI site. The 3'
primer #1870 has a BamHI site incorporated downstream of
the stop codon of the flagellin gene (Table 3B). PCR
reactions were carried out under the following conditions:
denaturing, 94°C/30'; annealing, temperature varies (refer
to Table 3A)/30'; extension, 72°C/1'; 30 cycles. The PCR
product was purified using the Promega Wizard PCR
purification kit (Madison WI USA) before being digested by
restriction enzymes NcoI and BamHI and cloned into the
NcoI/BamHI sites of plasmid pTrc99A.

Plasmid pTrc99A has a strong trc promoter upstream of
the polylinker. Downstream of the promoter, it contains
the ribosome binding site (SD sequence, see Fig 73) which
is located 8 bp upstream of the ATG site within the NcoI
site. The polylinker and the SD sequence of pTrc99A are
shown in Fig 73.

The plasmids generated were given pPR numbers, and
they are listed in Table 3A. In these plasmids, the
expression module consists of the trc promoter, the SD
sequence, and the full flagellin gene. The sequence of the
junction region between the SD sequence and the start of
the flagellin gene is shown in Fig 73.

For flagellin genes from type strains for H6, H7, H9,
H10, H12, H14, H18, H19, H20, H26, H28, H31, H41, H43,
H45, H46, H49, H51, and H52, we have the full sequence for
each gene and the primer sequences (#1868 and #1870) are
conserved among them. The cloned genes therefore have the
same sequence as those from the type strains.

For flagellin genes from type strains for H1, H15 and
H34, we also have the full sequence. The previously
published sequence of the flagellin gene from the H1 type
strain was extracted from GenBank (accession number
L07387) and used. Primer #1868 is conserved in all three.
However, primer #1870 has the third base of the fifth last
codon in the H1 sequence changed from A to G, and the
third base of the second last codon changed from C to T in
the H15 and H34 sequences: these changes did not change
the amino acid coded, so the cloned genes encode the same
gene products as those of the type strains.

For flagellin genes from type strains for H2, H3, H5,
H11, H21, H24, H27, H29, H38, H39, H42, and H56, we do not
have the full sequences. In the plasmids carrying genes
from these type strains, the expression module consists of
the trc promoter, the SD sequence, and the full flagellin
gene with the first and the last 21 base pairs being
determined by the primer sequences which are based on the
H7 flagellin gene of the H7 type strain. The sequence of
the junction region between the SD sequence and the start
of the flagellin gene is shown in Fig 73.

9.2. Cloning of the flagellin gene from type strain of
H23:

The full flagellin gene was PCR amplified using
primers #1868 and #1869 (Table 3A). #1868 is the 5'
primer: there is an NcoI site incorporated into the primer
(Table 3B) and the flagellin gene starts at base 3 of the
NcoI site. The 3' primer #1869 has a SalI site
incorporated downstream of the stop codon of the flagellin
gene (Table 3B). PCR reactions were carried out under the
following conditions: denaturing, 94°C/30'; annealing,
55°C/30'; extension, 72°C/1'; 30 cycles. The PCR product
was purified using the Promega Wizard PCR purification kit
(Madison WI USA) before being digested by restriction
enzymes NcoI and SalI and cloned into the NcoI/SalI sites
of plasmid pTrc99A to give plasmid pPR1942.

Plasmid pTrc99A has a strong trc promoter upstream of
the polylinker. Downstream of the promoter, it contains
the ribosome binding site (SD sequence, see Fig 73) which
is located 8 bp upstream of the ATG site within the NcoI
site. The polylinker and the SD sequence of pTrc99A are
shown in Fig 73.

The expression module of pPR1942 consists of the trc
promoter, the SD sequence, and the full flagellin gene.
The sequence of the junction region between the SD
sequence and the start of the flagellin gene is shown in
Fig 73.

9.3. Cloning of flagellin genes from type strains of H30,
H32 and H33:

The full flagellin gene was PCR amplified using
primers #1868 and #1871 (Table 3A). #1868 is the 5'
primer: there is an NcoI site incorporated into the primer
(Table 3B) and the flagellin gene starts at base 3 of the
NcoI site. The 3' primer #1871 has a PstI site
incorporated downstream of the stop codon of the flagellin
gene (Table 3B). PCR reactions were carried out under the
following conditions: denaturing, 94°C/30'; annealing,
temperature varies (refer to Table 3A)/30'; extension,
72°C/1'; 30 cycles. The PCR product was purified using the
Promega Wizard PCR purification kit (Madison WI USA)
before being digested by restriction enzymes NcoI and PstI
and cloned into the NcoI/PstI sites of plasmid pTrc99A.

Plasmid pTrc99A has a strong trc promoter upstream of
the polylinker. Downstream of the promoter, it contains
the ribosome binding site (SD sequence, see Fig 73) which
is located 8 bp upstream of the ATG site within the NcoI
site. The polylinker and the SD sequence of pTrc99A are
shown in Fig 73.

For flagellin genes from type strains for H30 and
H32, we have the full sequence. The primer #1868 sequence
is conserved in both of them. However, primer #1871 has the
third base of the fourth last codon in both sequences
changed from G to A to remove a PstI site (see Table 3B):
this change did not change the amino acid coded. The
expression module consists of the trc promoter, the SD
sequence, and the full flagellin gene coding for a gene
product which is the same as that of the type strain. The
sequence of the junction region between the SD sequence and
the start of the flagellin gene is shown in Fig 73.

We do not have the full sequence for the flagellin
gene from the H33 type strain. In the plasmid containing
the H33 type strain gene, the expression module consists
of the trc promoter, the SD sequence, and the full
flagellin gene with the first and the last 21 base pairs
being determined by the primer sequences which were used
for the cloning of H30 and H32. The sequence of the
junction region between the SD sequence and the start of
the flagellin gene is shown in Fig 73.

9.4. Flagellin genes from type strains for H4 and H17:

For the flagellin genes of H4 and H17 type strains
the full sequence was not obtained, and the sequenced
parts were PCR amplified and cloned into plasmid pPR1951
to give in each case a gene in which the first 26 and the
last 31 codons are based on the sequence of the H7
flagellin gene of the H7 type strain.



9.4.1 Construction of expression plasmid vector
pPR1951:

The first 26 codons of the H7 flagellin gene were
first PCR amplified using primers #1868 and #1872 (Table
3B). #1868 is the 5' primer: there is an NcoI site
incorporated into the primer (Table 3B) and the flagellin
gene starts at base 3 of the NcoI site. Primer #1872 was
made to have the last two codons (codons 25 and 26)
changed from CTG TCG (Leucine and Serine) to GGA TCC
(Glycine and Serine) to generate a BamHI site. This PCR
fragment was digested with NcoI and BamHI before being
cloned into the NcoI/BamHI sites of pTrc99A to make
plasmid pPR1949.

The last 31 codons (including the stop codon) of the
H7 flagellin gene were PCR amplified using primers #1884
and #1871 (Table 3A). The 5' primer, #1884, has the first
two of the 31 codons changed from TCG AAA (Serine and
Lysine) to TCT AGA (Serine and Arginine) to generate an
XbaI site (Table 3B). The 3' primer #1871 has a PstI site
incorporated downstream of the stop codon (Table 3B). This
PCR fragment was digested with XbaI and PstI, and then
cloned into the XbaI/PstI sites of pPR1949 to make plasmid
pPR1951.
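The two codon substitutions described above can be checked mechanically. The following sketch is an illustration only: it uses the standard genetic code for the seven codons involved and the usual recognition sequences for BamHI (GGATCC) and XbaI (TCTAGA), and confirms that each substitution creates the stated site and yields the stated amino acids. The helper names are hypothetical.

```python
# Standard genetic code entries for the codons named in the text.
CODON_TABLE = {
    "CTG": "Leu", "TCG": "Ser", "GGA": "Gly", "TCC": "Ser",
    "AAA": "Lys", "TCT": "Ser", "AGA": "Arg",
}

# Recognition sequences of the two enzymes whose sites are introduced.
SITES = {"BamHI": "GGATCC", "XbaI": "TCTAGA"}

# (primer, codons before, codons after, site generated), as stated in the text.
CHANGES = [
    ("#1872", ("CTG", "TCG"), ("GGA", "TCC"), "BamHI"),
    ("#1884", ("TCG", "AAA"), ("TCT", "AGA"), "XbaI"),
]


def translate(codons):
    """Translate a tuple of codons into three-letter amino acid codes."""
    return tuple(CODON_TABLE[c] for c in codons)


def creates_site(before, after, enzyme):
    """True if the site is absent from the original codons and present after the change."""
    site = SITES[enzyme]
    return site not in "".join(before) and site in "".join(after)


results = {}
for primer, before, after, enzyme in CHANGES:
    results[primer] = (translate(before), translate(after),
                       creates_site(before, after, enzyme))
    print(primer, "-".join(translate(before)), "->", "-".join(translate(after)),
          f"creates {enzyme}:", results[primer][2])
```

Only the two-codon windows are examined; whether a site also occurs elsewhere in a given flagellin gene would need the full sequence.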

9.4.2 Cloning of flagellin genes from the H4 and
H17 type strains:

The central regions of flagellin genes from type
strains H4 and H17 were PCR amplified using primers #1878
and #1885 (Table 3B), which have a BamHI and an XbaI site
incorporated at their ends respectively. PCR reactions
were carried out under the following conditions:
denaturing, 94°C/30'; annealing, 65°C/30'; extension,
72°C/1'; 30 cycles. The PCR product was purified using the
Promega Wizard PCR purification kit (Madison WI USA)
before being digested by restriction enzymes BamHI and
XbaI and cloned into the XbaI/BamHI sites of plasmid
pPR1951 to make plasmids pPR1955 (H4) and pPR1957 (H17).

The expression module of plasmids pPR1955 and pPR1957
consists of the trc promoter, the SD sequence, the first
24 codons of the H7 flagellin gene (of the H7 type
strain), 2 codons encoding Glycine and Serine, 292 or 293
codons of the central region based on the flagellin gene
obtained from the H4 or H17 type strain respectively, 2
codons encoding Serine and Arginine, and then the last 29
codons of the H7 flagellin gene (of the H7 type strain).
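The segment sizes listed for the hybrid genes can be added up to give the length of each coding region. This is a simple arithmetic sketch using only the codon counts stated above; the totals are derived here and are not quoted in the text, and the function name is hypothetical.

```python
def module_codons(central_codons: int) -> int:
    """Total codons in the hybrid gene, including the stop codon.

    Segments, in order: H7 5' end (24), Gly-Ser linker carrying the BamHI
    site (2), central region from the H4 or H17 gene, Ser-Arg linker
    carrying the XbaI site (2), H7 3' end including the stop codon (29).
    """
    return 24 + 2 + central_codons + 2 + 29


# Central region sizes as given in the text for each plasmid.
CENTRAL = {"pPR1955 (H4)": 292, "pPR1957 (H17)": 293}

totals = {name: module_codons(n) for name, n in CENTRAL.items()}
for name, codons in totals.items():
    print(f"{name}: {codons} codons, {3 * codons} bp")
```

The 24 + 2 and 2 + 29 groupings match the first 26 and last 31 codons of the H7 gene used to build pPR1951.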

10. Expression of flagellin gene plasmids in E. coli
strains lacking the fliC gene, and identification of the H
antigens encoded by these plasmids:

Plasmids carrying flagellin genes as described in
section 9 (see Table 3A for a list) were electroporated
into strains M2126 or P5560. Strains M2126 and P5560 do
not have functional fliC genes, and are not motile when
examined under a microscope. Transformants carrying any of
the plasmids listed in Table 3A are motile when examined
under a microscope. Thus, the flagellin genes in all of
the plasmids are expressed.

The antigenic specificity of the flagellin of each
transformant was then determined by slide agglutination.

10.1 Flagellin genes from type strains for H2,
H5, H6, H7, H9, H11, H14, H15, H18, H19, H20, H21, H23,
H24, H26, H27, H28, H29, H30, H31, H32, H33, H34, H39,
H41, H42, H43, H45, H46, H49, H51, H52, and H56:

As shown in Table 3A, strains with plasmids carrying
these flagellin genes expressed the same H antigen as
their respective flagellin parent strain.

For flagellin specificities H2, H5, H6, H7, H9, H14,
H15, H18, H19, H20, H23, H24, H26, H27, H28, H29, H31,
H33, H39, H51, H52, and H56, there was no cross reaction
reported between these flagellins and flagellin antisera
for other H antigens [Ewing, W. H.: Edwards and Ewing's
identification of the Enterobacteriaceae, Elsevier
Science Publishers, Amsterdam, The Netherlands, 1986], and
we conclude that we have in each case sequenced the gene
encoding the flagellin of the expected specificity from
the respective type strain.

It has been observed that cross reactions exist
between some type strains and certain antisera at
different levels of dilution (of the antisera), being H11
with anti-H21 and anti-H40, H21 with anti-H11, H30 with
anti-H32, H32 with anti-H30, H34 with anti-H24 and
anti-H31, H41 with anti-H37 and anti-H39, H42 with anti-H6,
H43 with anti-H37, H45 with anti-H20, H46 with anti-H17, and
H49 with anti-H39 [Ewing, W. H.: Edwards and Ewing's
identification of the Enterobacteriaceae, Elsevier
Science Publishers, Amsterdam, The Netherlands, 1986]. We
have tested strain M2126 or strain P5560 carrying plasmids
containing flagellin genes obtained from each of these
type strains (H11, H21, H30, H32, H34, H41, H42, H43, H45,
H46, and H49) with the appropriate cross-reacting
antisera.

For strain M2126 or strain P5560 carrying plasmids
containing flagellin genes obtained from type strains H11,
H34, H41, H42, H43, H45, H46, and H49, no cross reaction
was found. We conclude that we have in each case sequenced
the gene coding the flagellin of the expected specificity
from the respective type strain.

Cross reaction was observed for strain P5560 carrying
plasmid pPR1948 (containing the flagellin gene obtained
from the H30 type strain) with anti-H32 serum, strain
P5560 carrying pPR1940 (containing the flagellin gene
obtained from the H32 type strain) with anti-H30 serum,
and strain M2126 carrying plasmid pPR1995 (containing the
flagellin gene obtained from the H21 type strain) with
anti-H11 serum.

We note that the reported cross reactions between the
H30 type strain and anti-H32, the H32 type strain and
anti-H30, and the H21 type strain and anti-H11 happened at
a higher level of dilution (of antisera) than for all
other type strains with the antisera mentioned above
[Ewing, W. H.: Edwards and Ewing's identification of the
Enterobacteriaceae, Elsevier Science Publishers,
Amsterdam, The Netherlands, 1986]. We conclude that except
for these three cases, the antisera used were supplied at
a dilution which did not exhibit cross reactions. For the
three strains carrying flagellin genes cloned from type
strains for H30, H32, and H21, it was necessary to further
dilute the antiserum.

Strain P5560 carrying plasmid pPR1948 (containing the
flagellin gene obtained from the H30 type strain)
agglutinates with anti-H30 serum when the antiserum is
diluted to 1:60, but agglutinates with anti-H32 serum only
at a dilution of 1:10 and not at a 1:20 dilution (note
that the antisera we used have been diluted before
reaching our hands). In contrast, strain P5560 carrying
plasmid pPR1940 (containing the flagellin gene obtained from
the H32 type strain) agglutinates with anti-H32 serum when
the antiserum is diluted to 1:100, but agglutinates with
anti-H30 serum only at a 1:10 dilution and not when
diluted further. Thus, we conclude that the flagellin genes we
sequenced from type strains for H30 and H32 encode
flagellins of H30 and H32 specificities respectively.

Strain M2126 carrying plasmid pPR1995 (containing the
flagellin gene obtained from the H21 type strain)
agglutinates with anti-H21 serum when the antiserum is
diluted to 1:40, but agglutinates only with undiluted
anti-H11 serum and not at a 1:10 dilution (note that the
antisera we used have been diluted before reaching our
hands). In contrast, strain M2126 carrying plasmid pPR1981
(containing the flagellin gene obtained from the H11 type
strain) did not agglutinate with anti-H21 serum. Thus, we
conclude that the flagellin gene we sequenced from the type
strain for H21 encodes flagellin of H21 specificity.

10.2 Flagellin genes from type strains of H1 and
H12:

These two genes are very similar in sequence, with 8
a.a. differences between the gene products. It has been
known that some cross-reaction exists between anti-H12
serum and the H1 type strain and between anti-H1 serum and
the H12 type strain [Ewing, W. H.: Edwards and Ewing's
identification of the Enterobacteriaceae, Elsevier
Science Publishers, Amsterdam, The Netherlands, 1986].
Strain M2126 carrying pPR1920 (carrying a flagellin gene
from the H1 type strain, Table 3A) agglutinates with
anti-H1 serum when the antiserum is diluted to 1:100, but
agglutinates only with undiluted anti-H12 serum and not at
a 1:10 dilution (please note that the antisera we used
have been diluted before reaching our hands). In contrast,
strain M2126 carrying plasmid pPR1990 (carrying a
flagellin gene from the H12 type strain, Table 3A)
agglutinates with anti-H12 serum when the antiserum is
diluted to 1:100, but agglutinates only with undiluted
anti-H1 serum and not at a 1:10 dilution. Thus, we
conclude that the flagellin genes we sequenced from type
strains for H1 and H12 encode flagellins of H1 and H12
specificities respectively.
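The reasoning used for the cross-reacting pairs (H30/H32, H21/H11 and H1/H12) amounts to comparing the dilutions at which each antiserum still agglutinates a given clone. The sketch below restates that comparison using the dilutions reported above. It is illustrative only: the values are the dilutions quoted in the text, not measured end-point titres, undiluted serum is represented as 1, and the function name is hypothetical.

```python
UNDILUTED = 1  # agglutination seen only with serum as supplied

# Reciprocal of the highest dilution reported to agglutinate each clone.
TITRES = {
    "pPR1948": {"H30": 60, "H32": 10},          # gene from H30 type strain
    "pPR1940": {"H32": 100, "H30": 10},         # gene from H32 type strain
    "pPR1995": {"H21": 40, "H11": UNDILUTED},   # gene from H21 type strain
    "pPR1920": {"H1": 100, "H12": UNDILUTED},   # gene from H1 type strain
    "pPR1990": {"H12": 100, "H1": UNDILUTED},   # gene from H12 type strain
}


def assign_specificity(titres):
    """Return the antiserum type that agglutinates at the greatest dilution."""
    return max(titres, key=titres.get)


calls = {plasmid: assign_specificity(t) for plasmid, t in TITRES.items()}
for plasmid, h_type in calls.items():
    other = min(TITRES[plasmid], key=TITRES[plasmid].get)
    ratio = TITRES[plasmid][h_type] / TITRES[plasmid][other]
    print(f"{plasmid}: assigned {h_type} "
          f"({ratio:.0f}-fold over anti-{other})")
```

In every case the assigned specificity matches the type strain from which the gene was cloned, which is the conclusion drawn in the text.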

10.3. Flagellin gene coding for H16:

Strain P5560 carrying plasmid pPR1969 agglutinated
with anti-H16 serum. pPR1969 carries a flagellin gene
amplified from the H3 type strain. It has been shown that
this H3 type strain is a biphasic strain which can express
H3 and H16 specificities [Ratiner, Y. A. (1985) "Two
genetic arrangements determining flagellar antigen
specificities in two diphasic E. coli strains" FEMS
Microbiol Lett 29: 317-323]. Thus, the H3 type strain has
two flagellin genes coding for H3 and H16 specificities.
We conclude that we have cloned and sequenced the H16
flagellin gene from this H3 type strain.

10.4 Flagellin gene coding for H4:

The flagellin genes obtained from type strains for H4
and H17 are nearly identical, with 4 a.a. differences in
the gene products. Plasmid pPR1955 carries a flagellin
gene from the H4 type strain, and plasmid pPR1957 carries
a flagellin gene from the H17 type strain. Strain P5560
carrying plasmid pPR1955 or plasmid pPR1957 agglutinated
with anti-H4 serum, but not with anti-H17 serum. It has
been shown that the type strain for H17 is a biphasic
strain which can express H17 and H4 [Ratiner, Y. A. (1985)
"Two genetic arrangements determining flagellar antigen
specificities in two diphasic E. coli strains" FEMS
Microbiol Lett 29: 317-323]. The flagellin gene obtained
from the type strain for H44 is also highly similar to that
obtained from the H4 type strain, with 2 a.a. differences
in the gene products. It has been shown that the H44 type
strain has two complete flagellin genes, being H4 and H44
[Ratiner, Y. A. (1998) "New flagellin specifying genes in
some E. coli strains" J. Bacteriol 180: 979-984]. Thus, we
conclude that all three flagellin genes (obtained from
type strains for H4, H17 and H44, and sequenced) encode
the H4 flagellin, and that the flagellin genes for H17 and
H44 specificities have not yet been cloned.

10.5 Flagellin gene coding for H10:

The flagellin genes obtained from type strains for
H10 and H50 are nearly identical, with 3 a.a. differences
in the gene products. Strain P5560 carrying plasmid
pPR1923 (which carries a flagellin gene from the H10 type
strain) agglutinated with anti-H10 serum. We conclude that
the sequence obtained from the H10 type strain encodes the
H10 flagellin. It is not clear whether the sequence
obtained from the H50 type strain encodes H10 or H50 (see
the section for H50 below).

10.6 Flagellin gene coding for H38: 

The flagellin genes obtained from type strains for
H38 and H55 are nearly identical, with only 1 a.a.
difference in the gene products. Strain M2126 carrying
plasmid pPR1984 (carrying the flagellin gene from the type
strain H38) agglutinated with anti-H38 serum, but not with
anti-H55 serum. It also has been shown that the type
strain for H55 has two complete flagellin genes coding for
H55 and H38 specificities [Ratiner, Y. A. (1998) "New
flagellin specifying genes in some E. coli strains" J.
Bacteriol 180: 979-984]. Thus, we conclude that both
cloned genes encode the H38 flagellin.

10.7 Summary:

Flagellin genes coding for 39 H antigens have been
identified, being those for specificities H1, H2, H4, H5,
H6, H7, H9, H10, H11, H12, H14, H15, H16, H18, H19, H20,
H21, H23, H24, H26, H27, H28, H29, H30, H31, H32, H33,
H34, H38, H39, H41, H42, H43, H45, H46, H49, H51, H52, and
H56.


11. Comparison and alignment of the flagellin genes:

Programs Pileup [Devereux, J., Haeberli, P. and
Smithies, O.: A comprehensive set of sequence analysis
programs for the VAX. Nucl. Acids Res. 12 (1984) 387-395]
and Multicomp [Reeves, P.R., Farnell, L. and Lan, R.:
MULTICOMP: a program for preparing sequence data for
phylogenetic analysis. CABIOS 10 (1994) 281-284] were
used.

The previously published sequence of H1 (GenBank
accession number L07387) was extracted from GenBank and
used. Because we did not sequence H36 and H53 flagellin
genes and we did not have the H15 type strain, we only
compared 51 flagellin genes of H type strains and the fliC
genes from the additional 10 H7 strains.
Among the H7 fliC genes, the percentage of DNA
difference ranged from 0.0 to 2.39%. The flagellin genes
from type strains for H40 and H8 are identical. Some
others are nearly identical: H21 and H47 (1.5%
difference), H12 and H1 (2.6% difference), H10 and H50
(0.3% difference), and H38 and H55 (0.1% difference). H4,
H44 and H17 are very similar, the pairwise difference
ranging from 0.33% to 0.87%.
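
The percentage differences quoted here are pairwise comparisons
of aligned sequences. As a minimal sketch of that arithmetic
(it is not the Pileup or Multicomp implementation), the
following assumes the sequences have already been aligned to
equal length and that positions with a gap in either sequence
are skipped. The function names and the toy sequences in the
usage note are hypothetical.

```python
from itertools import combinations


def percent_difference(seq_a: str, seq_b: str) -> float:
    """Percentage of compared alignment columns at which two sequences differ.

    Assumption: columns where either sequence has a gap ('-') are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * mismatches / compared


def pairwise_differences(aligned: dict) -> dict:
    """Percent difference for every unordered pair of named aligned sequences."""
    return {
        (x, y): percent_difference(aligned[x], aligned[y])
        for x, y in combinations(sorted(aligned), 2)
    }
```

For example, pairwise_differences({"x": "AAAA", "y": "AAAT"})
returns a single entry for the pair ("x", "y") with one
mismatch in four compared columns.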



For the flagellin genes obtained from type strains
for H4, H17 and H44, we have shown that all three
genes encode flagellin with the H4 specificity (see
above).

For the flagellin genes obtained from type
strains for H21 and H47, and H38 and H55, we have
confirmed the specificity for one of each pair and have
good reason to conclude that both genes of each pair
encode the same H specificity (see above section), being
that for H21 and H38 specificities respectively.

For the flagellin genes obtained from type strains
for H10 and H50, we have confirmed that the one from the
H10 type strain encodes H10 specificity. As these two
genes are highly similar, we have presumed that they both
encode H10 specificity.

In the cases where the flagellin genes from two type
strains are near identical, we conclude that both genes
code for flagellin of the same H specificity and that one
or other strain has an additional locus which carries the
functional gene, although the flagellin genes sequenced
do not appear to be mutated.

We have shown by cloning and expression that the
flagellin genes obtained from the H1 and H12 type strains
encode H1 and H12 specificities respectively (see above
section). The nucleotide difference between these two
genes is higher at 2.6% (see above), but still within the
normal range for variation within a gene in E. coli. The
two antigens cross react, and this cross reaction must be
due to the high level of similarity of the flagellins
encoded by these two genes.

As discussed above, genes encoding some H antigens
have been shown to be located at loci other than fliC: H3,
H36, H47 and H53 have been shown to be at a locus called
flkA, H44 and H55 at fllA, and H54 at flmA [Ratiner Y A
(1998) "New flagellin-specifying genes in some Escherichia
coli strains" J. Bacteriol. 180 979-984]. However, these
strains may carry a fliC in addition to flkA, fllA or flmA
[Ratiner Y A (1998) "New flagellin-specifying genes in some
Escherichia coli strains" J. Bacteriol. 180 979-984].

The flagellin gene encoding H48 was previously
sequenced from E. coli strain K-12 [Kuwajima G, Asaka J,
Fujiwara T, Node K and Kondo E "Nucleotide sequence of the
hag gene encoding flagellin of Escherichia coli" J
Bacteriol. 168 (1986) 1479-1483]. We have sequenced the
fliC gene from the H48 type strain, and found that it is
identical to that from K-12.

The H54 gene is known to be at flmA [Ratiner Y A
(1998) "New flagellin-specifying genes in some Escherichia
coli strains" J. Bacteriol. 180 979-984]
and the finding of a non-functional presumptive fliC locus
in the H54 strain shows that it is present but not
expressed. However, we have not amplified and sequenced
the functional flmA gene of this strain.

Using the 43 unique sequences (being the 39
identified genes with confirmed specificities and the
flagellin genes obtained from the H8 (or H40), H25, H37,
and H48 type strains) and the sequences from the two
non-functional flagellin genes (from H type strains H35 and
H54) (see Table 3) we have been able to determine antigen
specific primers for each of the H antigen specificities
and thereby show that it is practicable to detect E. coli
strains carrying specific H antigens without false
positives from strains of other H types. There is no
reason to expect that the addition of 11 sequences to the
43 unique sequences obtained will affect the general
conclusion, as unlike previous reports, our study covers
flagellin sequences for a substantial majority of known E.
coli H antigen specificities.

Our study of 11 H7 genes from strains of eight
different O antigens shows limited variation, such that
the variation within genes for H antigens does not affect
the ability to select antigen specific primers.
O:H combinations in general define a strain and as some of
the strains thus defined were quite distant from each
other in a study by Whittam [Whittam T S, Wolfe M L,
Wachsmuth I K, Orskov I and Wilson R A "Clonal
relationships among Escherichia coli strains that cause
hemorrhagic colitis and infantile diarrhea" Infect. Immun.
61 (1993) 1619-1629] the variation we observe is thought
to represent that generally present in H7 genes. We also
obtained more than one sequence for flagellin genes for H
specificities H4, H10 and H38, and again the level of
variation within a given specificity is very low.
However, there is a low possibility that primers chosen
without knowledge of the variation within genes of each H
specificity could fail to give positive results with some
isolates due to chance choice of primers which cover a
base or bases which contribute to this low level
variation. The variation within the H7 genes is in the
normal range for variation within a gene in E. coli and if
this possibility did occur it would be easy to use an
alternate primer pair. For example, if a first primer in
a primer pair is unable to hybridise to a target region
because of low level variation in that region, a positive
result may be achieved by using a second primer in that
pair together with a third primer, whether or not the
third primer is specific for the flagellin gene. Where
the third primer is not specific for the flagellin gene,
the specificity of the primer pair derives from the
specificity of the second primer. The observation that
the overall level of variation within genes for a given H
specificity is very low makes it extremely unlikely that
the regions covered by the two primers specific for an H
specificity would both have undergone change in the same
strain.
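
The fallback described in this paragraph (replace a primer
whose target region carries a variant base with an alternate
primer) can be sketched as a simple selection over candidate
pairs. This is an illustration only: the primer names, the
site sequences and the zero-mismatch rule are hypothetical
assumptions, and real primer design would also weigh the
position of a mismatch and the annealing temperature.

```python
def mismatches(primer: str, site: str) -> int:
    """Number of positions at which a primer differs from its target site."""
    if len(primer) != len(site):
        raise ValueError("primer and site must have the same length")
    return sum(p != s for p, s in zip(primer.upper(), site.upper()))


def pick_primer_pair(candidate_pairs, sites, max_mismatches=0):
    """Return the first pair whose two primers both match their sites.

    candidate_pairs: ordered list of (primer_name, primer_name) tuples.
    sites: dict mapping primer name -> (primer sequence, the corresponding
           region in the isolate, written in the same orientation).
    Returns None when no candidate pair is usable.
    """
    for first, second in candidate_pairs:
        if all(mismatches(*sites[name]) <= max_mismatches
               for name in (first, second)):
            return (first, second)
    return None
```

With a variant base under a hypothetical primer P1, a call such
as pick_primer_pair([("P1", "P2"), ("P2", "P3")], sites) skips
the first pair and returns the pair that keeps P2 and brings in
a third primer P3.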

There are 54 known H antigens for E. coli and of
these there are 11 H antigen specificities for which we do
not as yet have sequence. It will be easy to determine
these sequences and determine primer pairs specific for
these H antigens by comparing these sequences with the 45
obtained sequences (see Table 3), and also to modify the
primers selected for any H antigen for which we already
know the sequence in the unlikely event that there is a
possibility of false positives with the primers selected.

The sequences for the remaining H antigens can be 
obtained in one of the following ways: 

1. Where we have two bands by PCR (H36 and H53 type
strains), we purify each and sequence, and also clone each
into a strain mutated in its fliC gene and determine the H
antigen expressed by use of specific sera. In this way a
specific sequence can be related to an H antigen
specificity. The other band which represents an H antigen
gene for a different specificity is expected to include a
mutant gene or a gene similar to one of those for a known
H specificity, but if not may represent a new specificity
for which primer pairs could be selected. It may be
difficult to obtain expression of flagellin genes when
cloned from E. coli due to cloning together with
regulatory sequences which prevent expression. This is
easily avoided by cloning the major segment of the gene
into a functioning fliC gene to replace the equivalent
segment of that gene, using standard site directed
mutagenesis to give suitable restriction sites within the
cloned gene and incorporating those restriction sites into
primers used to amplify the major segment of the gene to
be studied to facilitate the cloning. We have cloned and
sequenced the PCR bands from the H36 and the H55 type
strains using this method (see section 16).

2. Where two or more strains have the same flagellin
gene sequence, the genes are cloned as above and the H
antigen specificity represented by this sequence is
determined. This identifies the strain in which the
expected gene is expressed and also those strains for
which we have sequenced a gene which is not being
expressed. We then clone the gene for the antigen
expressed in these strains by making a bank of plasmid
clones using chromosomal DNA and select for a clone which
is expressing an H antigen different from the one
represented by the known sequence. This can be done by
taking advantage of the fact that the H antigen is on
flagellin, the protein of the bacterial flagellum used for
movement of the bacteria. In the presence of antibodies
specific to that flagellum the bacteria cannot swim. For
selection the clones are placed in a situation in which
motile cells can swim away from the others and be
collected. There are many versions of these techniques
and any could be used. One version is to place the
bacteria on a nutrient agar plate with reduced agar
content such that bacteria can swim away from the site of
inoculation. This is easily seen as growth on the plate
and a sample of the bacteria which are motile can be
recovered and cultivated. In this way bacteria carrying
cloned H antigen genes can be selected. If the medium in
the plate has antibody added to it only bacteria which
express an H antigen different to that recognised by the
antiserum will be able to swim. Specifically if the
antiserum used is specific for the H antigen expressed by
the gene for which we have sequence, only clones which
express a different H antigen, such as those expressing
the H antigen expressed by the H type strains used to make
the plasmid, will be selected. Once the clone is
obtained, the H antigen gene can be sequenced.

Our work has shown that there are at least 7 cases
where the H antigen type strains carry two H antigen genes
which appear to be complete and have the potential to
function. However, while E. coli does not (in general)
have a capacity to express more than one flagellin gene,
it is striking that there are several loci for flagellin
genes [Ratiner Y A (1998) "New flagellin-specifying genes
in some Escherichia coli strains" J. Bacteriol. 180
979-984]. Several of the pairs of H type strains with
identical or near identical sequence do not include any of
the H antigen types shown by Ratiner [Ratiner Y A
(1998) "New flagellin-specifying genes in some Escherichia
coli strains" J. Bacteriol. 180 979-984] to map other than
at fliC, although these predominate. This suggests that
there are additional cases where the expressed gene is not
the only flagellin gene present. However, the fact that
many of the cases where we obtained flagellin genes of
identical or near identical sequence and/or two flagellin
genes from one strain involve type strains found by
Ratiner [Ratiner Y A (1998) "New flagellin-specifying genes
in some Escherichia coli strains" J. Bacteriol. 180
979-984] to map away from fliC indicates that the
phenomenon is of limited extent. Nonetheless it remains
possible, even where only one gene has been obtained by
PCR, that it is one of a pair of flagellin genes, the
other not being amplified by the primers used, and further
that it is the one not amplified which is expressing the H
antigen of the strain.

It will therefore be necessary to clone as described
above each of the flagellin genes we have sequenced and
confirm that it expresses the expected antigen to ensure
that the invention gives results corresponding to those of
the traditional serotyping scheme. In the event that it
does not, the gene for the type antigen can be cloned and
sequenced by the means described above.

The 11 H7 fliC sequences fell into three groups, one
comprising the genes from the O157:H7 and O55:H7 strains,
which were identical, as expected given the proposed
relationship between the clones. It has been shown that E.
coli O157:H7 and O55:H7 clones are closely related
[Whittam T S, Wolfe M L, Wachsmuth I K, Orskov I and
Wilson R A "Clonal relationships among Escherichia coli
strains that cause hemorrhagic colitis and infantile
diarrhea" Infect. Immun. 61 (1993) 1619-1629]; thus it was
expected that the H7 fliC genes from O157 and O55 would be
identical. Among the H7 fliC sequences, we can identify
primers specific to the H7 fliC gene for each of the three
H7 groups. Two of these primers in combination with an H7
specific primer gave two primer pairs specific for the H7
gene from the O157:H7 and O55:H7 clones.

13. Specific oligonucleotide primers for each of the 43
flagellin genes

Two oligonucleotide primers were chosen based on each
of the 43 sequences. None of them had more than 85%
identity with any other of 61 flagellin gene sequences.
Thus, these primers are specific for each H type. These
primers are listed in Table 3.
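
The 85% identity criterion can be checked mechanically. The
sketch below is one possible way to do it and is not taken
from the work described here: it assumes an ungapped,
sliding-window comparison of the primer against both strands
of every other sequence, and the function names are
hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written with A, C, G and T."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def best_window_identity(primer: str, target: str) -> float:
    """Highest fraction of matching bases over all ungapped windows of target."""
    primer, target = primer.upper(), target.upper()
    n = len(primer)
    if n == 0 or len(target) < n:
        return 0.0
    best = 0.0
    for start in range(len(target) - n + 1):
        window = target[start:start + n]
        matches = sum(p == t for p, t in zip(primer, window))
        best = max(best, matches / n)
    return best


def is_specific(primer: str, other_sequences, threshold: float = 0.85) -> bool:
    """True if the primer never exceeds the identity threshold on either strand."""
    for seq in other_sequences:
        identity = max(best_window_identity(primer, seq),
                       best_window_identity(primer, reverse_complement(seq)))
        if identity > threshold:
            return False
    return True
```

A primer would be passed together with all flagellin gene
sequences other than the one it was designed from; a False
result flags a primer that should be redesigned.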

The flagellin gene of the H54 type strain is a
mutated gene. It has an insertion sequence (IS1222)
inserted into a normal flagellin gene of H21. Thus,
primers for H21 would amplify a fragment of different size
in H54. We also provide 2 primers based on the insertion
sequence (see H54 row in Table 3), and the use of one of
them in combination with one of the H21 primers will
generate a PCR band only in H54, which will also
differentiate those strains carrying the mutated H21 gene
from those expressing the H21 flagellin gene.

The fliC gene of the H35 type strain is also a mutated
gene. It has an insertion sequence (IS1) inserted into a
normal flagellin gene of H11. Thus, primers for H11 would
amplify a fragment of different size in H35. We also
provide 2 primers based on the insertion sequence (see H35
row in Table 3), and the use of one of them in combination
with one of the H11 primers will generate a PCR band only
in H35, which will also differentiate those strains
carrying the mutated H11 gene from those expressing the
H11 flagellin gene.
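
The reasoning for the H54 and H35 cases (a pair of flanking
primers gives a larger fragment when an insertion sequence
lies between them, while a pair with one primer inside the
insertion gives a band only when the insertion is present) can
be sketched as follows. All coordinates and lengths are
hypothetical placeholders to be supplied by the user; no
actual insertion sequence lengths are assumed.

```python
def expected_amplicon(fwd_start: int, rev_end: int,
                      insertion_after=None, insertion_len: int = 0) -> int:
    """Expected product size in bp for primers spanning fwd_start..rev_end.

    Coordinates are 1-based positions in the uninterrupted gene.
    insertion_after is the gene position after which an insertion sits,
    or None when the gene is uninterrupted.
    """
    size = rev_end - fwd_start + 1
    if insertion_after is not None and fwd_start <= insertion_after < rev_end:
        size += insertion_len
    return size


def interpret_bands(flanking_pair_band: bool, junction_pair_band: bool) -> str:
    """Interpret two PCRs: gene-specific pair, and gene primer + insertion primer."""
    if junction_pair_band:
        return "gene interrupted by the insertion sequence"
    if flanking_pair_band:
        return "uninterrupted gene"
    return "gene not detected"
```

For instance, with hypothetical primers spanning positions 100
to 600, an insertion of 1000 bp after position 300 changes the
expected product from 501 bp to 1501 bp.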



14. Testing of the H7 specific oligonucleotide primers 

Primer pair #1806/#1809 (see Table 3) was used to
carry out PCR on chromosomal DNA samples of all the 54 H
type strains and the H7 strains listed in Table 1. PCR
reactions were carried out under the following conditions:
denaturing, 94°C/30'; annealing, 58°C/30'; extension,
72°C/1'; 30 cycles. The PCR reaction was carried out in a
volume of 50µl for each of the chromosomal samples. After
the PCR reaction, 5µl of PCR product from each sample was
run on an agarose gel to check for amplified DNA.

Primer pair #1806/#1809 produced a band of the predicted
size with all the 11 strains expressing H7, but gave no
band with other H type strains. Thus, these primers are H7
specific.

15. Testing of oligonucleotide primers specific to H7 of
O157 and O55:

Based on a comparison of the fliC sequences of 11
different H7 strains, we have identified two
oligonucleotides [#1696 (5'-GGCGTGAGTGAGGGGGCG) at
positions 178 to 195 in M527 and #1697
(5'-GAGTTACCGGCCTGCTGA) at positions 1700-1683 in M527]
which are unique to H7 of O157 and O55. Although not
identical to any parts of the fliC sequences of any other
H7 strains, these two primers are identical or have high
level similarity to fliC genes of some other H types.
However a combination of one of these primers with one of
the H7 specific primers can give specificity for H7 of
O157:H7 and O55:H7 E. coli.
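
The logic of combining the two kinds of primer is a set
intersection: a product is obtained only from strains to
which both primers of a pair can bind. The sketch below
illustrates this with invented placeholder labels; it is not
experimental data, and the function name is hypothetical.

```python
def amplified_strains(binds_first, binds_second) -> set:
    """Strains giving a PCR product: those bound by both primers of a pair."""
    return set(binds_first) & set(binds_second)


# Placeholder labels only. A primer unique among H7 genes to the O157/O55
# group may also match fliC genes of some other H types, while an H7 specific
# primer matches every H7 gene but no other H type.
group_primer_binds = {"O157:H7", "O55:H7", "some other H type"}
h7_primer_binds = {"O157:H7", "O55:H7", "another H7 strain"}

pair_detects = amplified_strains(group_primer_binds, h7_primer_binds)
```

Here pair_detects contains only the two labels shared by both
sets, which is why neither primer needs to be fully specific
on its own.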

Primer pairs #1696/#1809 and #1697/#1806 were used to
carry out PCR on chromosomal DNA samples of all the H
type strains and the H7 strains listed in Table 1. PCR
reactions were carried out under the following conditions:
denaturing, 94°C/30'; annealing, 61°C/30' (for
#1696/#1809) or 60°C/30' (for #1697/#1806); extension,
72°C/1'; 30 cycles. The PCR reaction was carried out in a
volume of 50µl for each of the chromosomal samples. After
the PCR reaction, 5µl of PCR product from each sample was
run on an agarose gel to check for amplified DNA.

Both primer pairs produced a band of the predicted size
with both of the O157:H7 strains (strains M1004 and M527,
see Table 1), and the O55:H7 strain (strain M1686, see
Table 1), but gave no band with other strains. Thus, these
two pairs of primers are specific to H7 genes of O157:H7
and O55:H7 E. coli strains.

16. Identification of flagellin genes for the remaining 15
H specificities.

16.1. Sequencing the potential flkA gene coding for the 
H36 flagellin: 

Using primers #1431 (5'- atg gca caa gtc att aat acc
caa c) and #1432 (5'- cta acc ctg cag cag aga ca), we have
amplified two bands from the H36 type strain. The PCR
reaction was carried out under the following conditions:
denaturing, 94°C/30'; annealing, 57°C/30'; extension,
72°C/1'; 30 cycles. These two PCR fragments were then
cloned into the pGEM-T vector using the Promega pGEM-T
cloning kit (Madison WI USA) to make plasmids pPR1992 and
pPR1993. Inserts from both plasmids were first sequenced
using the M13 universal primers (which bind to the pGEM-T
DNA flanking the insertion site). For pPR1992, primers
based on the sequence obtained were then used to sequence
further, and this procedure was repeated until the insert
was fully sequenced.
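
The iterative procedure described here (read from a primer,
design the next primer from the end of the sequence just
obtained, and repeat) can be simulated on a known string. This
is a toy model under stated assumptions: reads have a fixed
length, each new read starts immediately after its primer, and
sequencing proceeds from one end only. The function name and
parameters are hypothetical.

```python
def primer_walk(template: str, read_length: int, primer_length: int = 20):
    """Simulate primer walking from one end of an insert.

    The first read is assumed to come from a vector primer next to the
    insert. Returns the assembled sequence and the list of walking primers
    that would have to be synthesised.
    """
    if read_length <= 0:
        raise ValueError("read_length must be positive")
    assembled = template[:read_length]      # first read, from the vector primer
    primers = []
    while len(assembled) < len(template):
        primer = assembled[-primer_length:]  # designed from the newest sequence
        primers.append(primer)
        next_start = len(assembled)          # read begins just after the primer
        assembled += template[next_start:next_start + read_length]
    return assembled, primers
```

With a 100-base template and 40-base reads, two walking
primers are needed after the initial vector-primed read.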

The sequence of the insert of pPR1992 is identical to
that of the H12 flagellin gene sequence except perhaps for
the first 8 and last 7 codons, which are encoded by the PCR
primers in plasmid pPR1992. We have only sequenced the
two ends of the insert of plasmid pPR1993 (Figures 71 and
72), and the sequences of the two ends of the insert of
pPR1993 are very similar to the ends of other sequenced
flagellin genes. We conclude that the insert of plasmid
pPR1993 carries a flagellin gene. The full sequence of
the insert of plasmid pPR1993 can be obtained using the
same method as for the sequencing of the insert of plasmid
pPR1992. It is known that the flkA gene encodes the H36
flagellin [Ratiner, Y. A. (1998) "New flagellin specifying
genes in some E. coli strains" J. Bacteriol 180: 979-984],
and it is highly likely that plasmid pPR1993 contains the
flkA gene of the H36 type strain. H specificities can be
confirmed by slide agglutination.

The currently uncharacterised sequence of both ends
and of DNA flanking these two sequenced genes can be
obtained by PCR walking and sequencing. Methods for PCR
walking from a known sequence to an unknown region in
chromosomal DNA are available (see [Siebert, P. D., A.
Chenchi, D. E. Kellogg, A. Lukyanov and S. A. Lukyanov
(1995) "An improved PCR method for walking in uncloned
genomic DNA." Nuc. Acids Res. 23: 1087-1088]).

The sequenced genes then can be PCR amplified and
cloned using the method(s) described in section 9.
Flagellins expressed by strain M2126 carrying these
plasmids then can be determined by use of specific sera.
The sequences flanking the flkA gene can then be used
to PCR amplify other flkA genes (see below).

16.2 The flkA genes coding for H3, H47 and H53:

It has been shown that flagellins H3, H47 and H53 are
encoded by flkA genes in the type strains [Ratiner, Y. A.
(1998) "New flagellin specifying genes in some E. coli
strains" J. Bacteriol 180: 979-984]. These genes can be
PCR amplified using primers based on the sequences
flanking the flkA gene in the H36 type strain. These PCR
fragments can then be sequenced, and the genes expressed
in strain M2126 for the identification of these genes.

16.3 The fllA genes coding for H44 and H55: 

It is known that flagellins H44 and H55 are coded by 
fllA genes. 

16.3.1 The H55 flagellin gene: 

Using primers #1868 and #1870 (Table 3B), we have
amplified two bands from the H55 type strain. The PCR
reaction was carried out under the following conditions:
denaturing, 94°C/30'; annealing, 50°C/30'; extension,
72°C/1'; 30 cycles. These two PCR fragments were then
cloned into the pGEM-T vector using the Promega pGEM-T
cloning kit (Madison WI USA) to make plasmids pPR1994 and
pPR1989. Inserts from both plasmids were first sequenced
using the M13 universal primers (which bind to the pGEM-T
DNA flanking the insertion site). Primers based on the
sequence obtained were then used to sequence further, and
this procedure was repeated until both inserts were fully
or partly sequenced.

The sequence of the insert of pPR1994 is highly
similar to that of the flagellin gene of the H38 type
strain, with 1 amino acid difference in the gene products.
We have only sequenced the two ends of the insert of
plasmid pPR1989 (Figures 70A and 70B), and the sequences
of the two ends of the insert of pPR1989 are very similar
to the ends of other sequenced flagellin genes. We
conclude that the insert of plasmid pPR1989 carries a
flagellin gene. The full sequence of the insert of plasmid
pPR1989 can be obtained using the same method as for the
sequencing of the insert of plasmid pPR1994. It is known
that the H55 type strain carries flagellin genes for both
H38 and H55, and that the H55 flagellin gene is at the
fllA locus [Ratiner, Y. A. (1998) "New flagellin
specifying genes in some E. coli strains" J. Bacteriol
180: 979-984]. Thus, it is highly likely that plasmid
pPR1989 contains the fllA gene of the H55 type strain.

The currently uncharacterised sequence of both ends
and of DNA flanking these two sequenced genes can be
obtained by PCR walking and sequencing. Methods for PCR
walking from a known sequence to an unknown region in
chromosomal DNA are available (see [Siebert, P. D., A.
Chenchi, D. E. Kellogg, A. Lukyanov and S. A. Lukyanov
(1995) "An improved PCR method for walking in uncloned
genomic DNA." Nuc. Acids Res. 23: 1087-1088]).

The sequenced genes then can be PCR amplified and
cloned using the method(s) described in section 9.
Flagellins expressed by strain M2126 carrying these
plasmids then can be determined by use of specific sera.

16.3.2 The H44 flagellin gene:

The sequence information for DNA flanking the fllA
gene in the H55 type strain can then be used to PCR
amplify, sequence and identify the fllA gene in the H44
type strain.

16.4 The flmA gene coding for H54:

This gene can be cloned by making a bank of plasmid
clones in strain M2126 using chromosomal DNA of the H54
type strain and selecting for a transformant which is
motile on an agar plate. This is done by taking advantage
of the fact that the H antigen is on flagellin, the
protein of the bacterial flagellum used for movement of
the bacteria. Strain M2126 lacks flagellin. Once the
clone(s) is obtained and identified by use of anti-H54
serum, the flagellin gene can be sequenced. It is possible
that clones expressing different flagellin specificities
can be obtained, and each of them can be identified by
using different sera.

16.5 The flagellin genes obtained from the H37
and H48 type strains:

We have used primers #1868 and #1869 (both were based
on the sequence obtained from the H48 type strain, also
see section 9) and primers #1868 and #1870 (both were
based on the sequences of the H7 flagellin gene of the H7
type strain, also see section 9) to PCR amplify and clone
the sequenced flagellin genes from the H48 and H37 type
strains respectively. Strain P5560 carrying the plasmid
containing either cloned gene was not motile and did
not react with the appropriate antisera. It is highly
likely that mutations have occurred due to PCR errors. This
can be resolved by re-amplification and re-cloning of the
genes.

16.6 The flagellin gene obtained from the H25 type
strain:

The flagellin gene sequence we first obtained from
the H25 type strain lacks 23 and 21 codons at the 5' and 3'
ends respectively. We could not amplify the full gene from
the H25 type strain using primers based on the H7
flagellin gene of the H7 type strain, and it was necessary
to get the full sequence of this flagellin gene by other
means.

We have used primers (#2650: 5'- cag cga tga aat act
tgc cat and #2648: 5'- caa tgc ttc gtg acg cac) based on
the genes (fliD and fliA respectively) flanking the fliC
gene in E. coli K-12 [Blattner, F. R., G. I. Plunkett, C. A.
Bloch, N. T. Perna, V. Burland, M. Riley et al. (1997)
"The complete genome sequence of E. coli K-12" Science
277: 1453-1474] and primers (#2658: 5'- gcc tga gtc aga
cct ttg and #2653: 5'- aac ctg tct gaa gcg cag) based on
the flagellin sequence obtained from the H25 type strain
to PCR amplify both ends of the flagellin gene. The PCR
product was then sequenced, and we have now obtained the
full flagellin gene sequence and sequence for the DNA
flanking the flagellin gene from type strain H25 (Figure
69). Now it is straightforward to PCR amplify, clone,
express and identify this gene using the methods
described in sections 9 and 10.

16.7 The flagellin genes obtained from the H8
and H40 type strains:

The flagellin gene sequences obtained from both the
H8 and H40 type strains lack 18 and 15 codons at the 5' and
3' ends respectively. We have used primers based on the H7
flagellin gene of the H7 type strain to PCR amplify and
clone the full genes from these two strains. Strain M2126
carrying a plasmid made this way was not motile under the
microscope and did not react with the appropriate
antisera. This could be due to PCR errors as mentioned in
section 16.5, or perhaps the first and last few amino acids
encoded by the primers (based on the H7 flagellin gene) are
incompatible in this case.

The full sequence of the gene can be obtained
using the method described in section 16.6. The flagellin
gene can then be PCR amplified, cloned, expressed and
identified using the methods described in sections 9 and
10.

The gene products of the flagellin genes obtained
from the H8 and H40 type strains are identical. Thus, one
of these two H specificities must be encoded by an unknown
gene, and it can be cloned and identified using the method
described in section 16.8.

16.8 Flagellin genes coding for H17, H35 and H50:

As mentioned above, the sequenced flagellin genes
from the H17 and H50 type strains encode H4 and H10
specificities respectively. The flagellin gene sequence
obtained from the H35 strain has an insertion and encodes a
non-functional gene (see section 8). Thus, genes coding
for these flagellins have not been identified, and their
location is unknown. One can use primers based on DNA
flanking fliC, fllA, flkA and flmA to do PCR on the type
strain for each of these flagellin antigens. PCR products
can then be sequenced, and possible genes can be cloned,
expressed and identified.

If the target gene is not PCR amplified using primers
based on sequence of these loci or sequence flanking these
loci, it can be cloned by making a bank of plasmid clones
in strain M2126 using chromosomal DNA of the type strain
and selecting for a transformant which is motile on an
agar plate. This is done by taking advantage of the fact
that the H antigen is on flagellin, the protein of the
bacterial flagellum used for movement of the bacteria.
Strain M2126 lacks flagellin. Once the clone(s) is
obtained and identified by use of antisera, the flagellin
gene can be sequenced. It is possible that clones
expressing different flagellin antigens can be obtained,
and each of them can be identified by using different
antisera. Antiserum for H50 can be prepared using
standard methods [Ewing, W. H.: Edwards and Ewing's
Identification of the Enterobacteriaceae, Elsevier
Science Publishers, Amsterdam, The Netherlands, 1986].

O antigen 

Materials and Methods-part 1 

The experimental procedures for the isolation and 
characterisation of the E. coli O111 O antigen gene 
cluster (position 3,021-9,981) are according to Bastin 
D.A., et al. 1991 "Molecular cloning and expression in 
Escherichia coli K-12 of the rfb gene cluster determining 
the O antigen of an E. coli O111 strain". Mol. Microbiol. 
5(9): 2223-2231 and Bastin D.A. and Reeves, P.R. 1995 
"Sequence and analysis of the O antigen gene (rfb) cluster 
of Escherichia coli O111". Gene 164: 17-23. 

A. Bacterial strains and growth media 

Bacteria were grown in Luria broth supplemented as 
required. 

B. Cosmids and phage 

Cosmids in the host strain x2819 were repackaged in 
vivo. Cells were grown in 250mL flasks containing 30mL of 
culture, with moderate shaking at 30°C to an optical 
density of 0.3 at 580 nm. The defective lambda prophage 
was induced by heating in a water bath at 45°C for 15min 
followed by an incubation at 37°C with vigorous shaking 
for 2hr. Cells were then lysed by the addition of 0.3mL 
chloroform and shaking for a further 10min. Cell debris 
was removed from 1mL of lysate by a 5min spin in a 
microcentrifuge, and the supernatant removed to a fresh 
microfuge tube. One drop of chloroform was added and 
shaken vigorously through the tube contents. 

C. DNA preparation 

Chromosomal DNA was prepared from bacteria grown 
overnight at 37°C in a volume of 30mL of Luria broth. 
After harvesting by centrifugation, cells were washed and 
resuspended in 10mL of 50mM Tris-HCl pH 8.0. EDTA was 
added and the mixture incubated for 20min. Then lysozyme 
was added and incubation continued for a further 10min. 
Proteinase K, SDS, and ribonuclease were then added and 
the mixture incubated for up to 2hr for lysis to occur. 
All incubations were at 37°C. The mixture was then 
heated to 65°C and extracted once with 8mL of phenol at 
the same temperature. The mixture was then extracted once 
with 5mL of phenol/chloroform/iso-amyl alcohol at 4°C. 
Residual phenol was removed by two ether extractions. 
DNA was precipitated with 2 vols. of ethanol at 4°C, 
spooled and washed in 70% ethanol, resuspended in 1-2mL 
of TE and dialysed. Plasmid and cosmid DNA was prepared 
by a modification of the Birnboim and Doly method 
[Birnboim, H. C. and Doly, J. (1979) "A rapid alkaline 
extraction procedure for screening recombinant plasmid 
DNA" Nucl. Acid Res. 7:1513-1523]. The volume of culture 
was 10mL and the lysate was extracted with 
phenol/chloroform/iso-amyl alcohol before precipitation 
with isopropanol. Plasmid DNA to be used as vector was 
isolated on a continuous caesium chloride gradient 
following alkaline lysis of cells grown in 1L of culture. 

D. Enzymes and buffers. 

Restriction endonucleases and T4 DNA ligase were 
purchased from Boehringer Mannheim (Castle Hill, NSW, 
Australia) or Pharmacia LKB (Melbourne, VIC, Australia). 
Restriction enzymes were used in the recommended 
commercial buffer. 

E. Construction of a gene bank. 

Individual aliquots of M92 chromosomal DNA (strain 
Stoke W, from Statens Serum Institut, 5 Artillerivej, 
2300 Copenhagen S, Denmark) were partially digested with 
0.2U Sau3AI for 1-15min. Aliquots giving the greatest 
proportion of fragments in the size range of 
approximately 40-50kb were selected and ligated to vector 
pPR691 previously digested with BamHI and PvuII. Ligation 
mixtures were packaged in vitro with packaging extract. 
The host strain for transduction was x2819 and 
recombinants were selected with kanamycin. 

F. Serological procedures. 

Colonies were screened for the presence of the O111 
antigen by immunoblotting. Colonies were grown 
overnight, up to 100 per plate, then transferred to 
nitrocellulose discs and lysed with 0.5N HCl. Tween 20 
was added to TBS at 0.05% final concentration for 
blocking, incubating and washing steps. Primary antibody 
was E. coli O group 111 antiserum, diluted 1:800. The 
secondary antibody was goat anti-rabbit IgG labelled with 
horseradish peroxidase diluted 1:5000. The staining 
substrate was 4-chloro-1-naphthol. Slide agglutination 
was performed according to the standard procedure. 

G. Recombinant DNA methods. 

Restriction mapping was based on a combination of 
standard methods including single and double digests and 
sub-cloning. Deletion derivatives of entire cosmids were 
produced as follows: aliquots of 1.8µg of cosmid DNA were 
digested in a volume of 20µL with 0.25U of restriction 
enzyme for 5-80min. One half of each aliquot was used to 
check the degree of digestion on an agarose gel. The 
sample which appeared to give a representative range of 
fragments was ligated at 4°C overnight and transformed by 
the CaCl2 method into JM109. Selected plasmids were 
transformed into Sfl74 by the same method. P4657 was 
transformed with pPR1244 by electroporation. 

H. DNA hybridisation 

Probe DNA was extracted from agarose gels by 
electroelution and was nick-translated using [α-32P]-dCTP. 
Chromosomal or plasmid DNA was electrophoresed in 
0.8% agarose and transferred to a nitrocellulose 
membrane. The hybridisation and pre-hybridisation 
buffers contained either 30% or 50% formamide for low 
and high stringency probing respectively. Incubation 
temperatures were 42°C and 37°C for pre-hybridisation and 
hybridisation respectively. Low stringency washing of 
filters consisted of 3 x 20min washes in 2 x SSC and 0.1% 
SDS. High-stringency washing consisted of 3 x 5min 
washes in 2 x SSC and 0.1% SDS at room temperature, a 1hr 
wash in 1 x SSC and 0.1% SDS at 58°C and a 15min wash in 
0.1 x SSC and 0.1% SDS at 58°C. 

I. Nucleotide sequencing of E. coli O111 O antigen gene 
cluster (position 3,021-9,981) 

Nucleotide sequencing was performed using an ABI 373 
automated sequencer (CA, USA). The region between map 
positions 3.30 and 7.90 was sequenced using 
uni-directional exonuclease III digestion of deletion 
families made in pT7T3190 from clones pPR1270 and 
pPR1272. Gaps were filled largely by cloning of selected 
fragments into M13mp18 or M13mp19. The region from map 
positions 7.90-10.2 was sequenced from restriction 
fragments in M13mp18 or M13mp19. Remaining gaps in both 
the regions were filled by priming from synthetic 
oligonucleotides complementary to determined positions 
along the sequence, using a single stranded DNA template 
in M13 or phagemid. The oligonucleotides were designed 
after analysing the adjacent sequence. All sequencing 
was performed by the chain termination method. Sequences 
were aligned using SAP [Staden, R., 1982 "Automation of 
the computer handling of gel reading data produced by the 
shotgun method of DNA sequencing". Nuc. Acid Res. 10: 
4731-4751; Staden, R., 1986 "The current status and 
portability of our sequence handling software". Nuc. Acid 
Res. 14: 217-231]. The program NIP [Staden, R. 1982 "An 
interactive graphics program for comparing and aligning 
nucleic acid and amino acid sequence". Nuc. Acid Res. 
10: 2951-2961] was used to find open reading frames and 
translate them into proteins. 
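
The open reading frame search described above was carried out with the Staden NIP program. Purely as an illustration of the general idea, the following Python sketch scans a DNA sequence for ATG-initiated open reading frames and translates them with the standard genetic code. It is not the NIP algorithm: the function names, the forward-strand-only scan and the min_codons cut-off are assumptions made for this example, and the input is assumed to contain only A, C, G and T.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate DNA codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def find_orfs(dna, min_codons=2):
    """Return (start, end, protein) for ATG...stop ORFs on the forward strand.

    start and end are 0-based, end is exclusive and includes the stop codon.
    min_codons (hypothetical cut-off) counts codons before the stop.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(dna):
            if dna[i:i + 3] == "ATG":
                j = i
                # Extend in frame until a stop codon or the end of the sequence.
                while j + 3 <= len(dna) and CODON_TABLE[dna[j:j + 3]] != "*":
                    j += 3
                if j + 3 <= len(dna) and (j - i) // 3 >= min_codons:
                    orfs.append((i, j + 3, translate(dna[i:j + 3])))
                i = j + 3
            else:
                i += 3
    return orfs
```

A real analysis would also scan the reverse complement and apply a much larger minimum length before calling a gene.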

J. Isolation of clones carrying E. coli O111 O antigen 
gene cluster 

The E. coli O antigen gene cluster was isolated 
according to the method of Bastin D.A., et al. [1991 
"Molecular cloning and expression in Escherichia coli 
K-12 of the rfb gene cluster determining the O antigen of 
an E. coli O111 strain". Mol. Microbiol. 5(9), 2223- 
2231]. Cosmid gene banks of M92 chromosomal DNA were 
established in the in vivo packaging strain x2819. From 
the genomic bank, 3.3 x 10^ colonies were screened with 
E. coli O111 antiserum using an immuno-blotting procedure: 
5 colonies (pPR1054, pPR1055, pPR1056, pPR1058 and 
pPR1287) were positive. The cosmids from these strains 
were packaged in vivo into lambda particles and 
transduced into the E. coli deletion mutant Sfl74 which 
lacks all O antigen genes. In this host strain, all 
plasmids gave positive agglutination with O111 antiserum. 

An EcoRI restriction map of the 5 independent cosmids 
showed that they have a region of approximately 11.5 kb 
in common (Figure 1). Cosmid pPR1058 included sufficient 
flanking DNA to identify several chromosomal markers 
linked to the O antigen gene cluster and was selected for 
analysis of the O antigen gene cluster region. 

K. Restriction mapping of cosmid pPR1058 

Cosmid pPR1058 was mapped in two stages. A 
preliminary map was constructed first, and then the 
region between map positions 0.00 and 23.10 was mapped in 
detail, since it was shown to be sufficient for O111 
antigen expression. Restriction sites for both stages 
are shown in Figure 2. The region common to the five 
cosmid clones was between map positions 1.35 and 12.95 of 
pPR1058. 

To locate the O antigen gene cluster within pPR1058, 
the pPR1058 cosmid was probed with DNA probes covering O 
antigen gene cluster flanking regions from S. enterica 
LT2 and E. coli K-12. Capsular polysaccharide (cps) genes 
lie upstream of the O antigen gene cluster while the 
gluconate dehydrogenase (gnd) gene and the histidine 
(his) operon are downstream, the latter being further 
from the O antigen gene cluster. The probes used were 
pPR472 (3.35kb), carrying the gnd gene of LT2, pPR685 
(5.3kb) carrying two genes of the cps cluster, cpsB and 
cpsG of LT2, and K350 (16.5kb) carrying all of the his 
operon of K-12. Probes hybridised as follows: pPR472 
hybridised to 1.55kb and 3.5kb (including 2.7kb of 
vector) fragments of PstI and HindIII double digests of 
pPR1246 (a HindIII/EcoRI subclone derived from pPR1058, 
Figure 2), which could be located at map positions 12.95- 
15.1; pPR685 hybridised to a 4.4kb EcoRI fragment of 
pPR1058 (including 1.3kb of vector) located at map 
position 0.00-3.05; and K350 hybridised with a 32kb EcoRI 
fragment of pPR1058 (including 4.0kb of vector), located 
at map position 17.30-45.90. Subclones containing the 
presumed gnd region complemented a gnd- edd- strain, 
GB23152. On gluconate bromothymol blue plates, pPR1244 
and pPR1292 in this host strain gave the green colonies 
expected of a gnd+ edd- genotype. The his+ phenotype was 
restored by plasmid pPR1058 in the his deletion strain 
Sfl74 on minimal medium plates, showing that the plasmid 
carries the entire his operon. 

It is likely that the O antigen gene cluster region 
lies between gnd and cps, as in other E. coli and S. 
enterica strains, and hence between the approximate map 
positions 3.05 and 12.95. To confirm this, deletion 
derivatives of pPR1058 were made as follows: first, 
pPR1058 was partially digested with HindIII and 
self-ligated. Transformants were selected for kanamycin 
resistance and screened for expression of O111 antigen. 
Two colonies gave a positive reaction. EcoRI digestion 
showed that the two colonies hosted identical plasmids, 
one of which was designated pPR1230, with an insert which 
extended from map positions 0.00 to 23.10. Second, 
pPR1058 was digested with SalI and partially digested 
with XhoI and the compatible ends were re-ligated. 
Transformants were selected with kanamycin and screened 
for O111 antigen expression. Plasmid DNA of 8 
positively reacting clones was checked using EcoRI and 
XhoI digestion and appeared to be identical. The cosmid 
of one was designated pPR1231. The insert of pPR1231 
contained the DNA region between map positions 0.00 and 
15.10. Third, pPR1231 was partially digested with XhoI, 
self-ligated, and transformants selected on 
spectinomycin/streptomycin plates. Clones were screened 
for kanamycin sensitivity and of 10 selected, all had the 
DNA region from the XhoI site in the vector to the XhoI 
site at position 4.00 deleted. These clones did not 
express the O111 antigen, showing that the XhoI site at 
position 4.00 is within the O antigen gene cluster. One 
clone was selected and named pPR1288. Plasmids pPR1230, 
pPR1231, and pPR1288 are shown in Figure 2. 
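
The mapping argument in this section amounts to intersecting map intervals. The short Python sketch below restates it with the map positions quoted in the text: the regions that still express the O111 antigen are intersected, and the result is trimmed by the flanking cps and gnd probe positions. The variable and function names are illustrative assumptions, and the sketch treats the reported positions as exact, which the text itself describes only as approximate.

```python
def intersect(intervals):
    """Return the overlap (start, end) of all intervals, or None if empty."""
    start = max(lo for lo, _ in intervals)
    end = min(hi for _, hi in intervals)
    return (start, end) if start < end else None


# Map positions (as on the pPR1058 map) taken from the text.
expressing_regions = {
    "pPR1230 insert": (0.00, 23.10),
    "pPR1231 insert": (0.00, 15.10),
    "common to the five cosmids": (1.35, 12.95),
}
cps_probe_region = (0.00, 3.05)    # pPR685 hybridisation
gnd_probe_region = (12.95, 15.10)  # pPR472 hybridisation

shared = intersect(list(expressing_regions.values()))

# The cluster is expected to lie between cps (upstream) and gnd (downstream).
candidate = (max(shared[0], cps_probe_region[1]),
             min(shared[1], gnd_probe_region[0]))

# The XhoI site whose deletion abolished O111 expression (pPR1288).
xhoi_site = 4.00
site_in_candidate = candidate[0] < xhoi_site < candidate[1]
```

With these inputs the candidate interval is 3.05 to 12.95, and the XhoI site at 4.00 falls inside it, consistent with the loss of expression in pPR1288.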

L. Analysis of the E. coli O111 O antigen gene 
cluster (position 3,021-9,981) nucleotide sequence data 

Bastin and Reeves [1995 "Sequence and analysis of 
the O antigen gene (rfb) cluster of Escherichia coli O111". 
Gene 164: 17-23] partially characterised the E. coli O111 
O antigen gene cluster by sequencing a fragment from map 
position 3,021-9,981. Figure 3 shows the gene 
organisation of position 3,021-9,981 of the E. coli O111 O 
antigen gene cluster. orf3 and orf6 have high level 
amino acid identity with wcaH and wcaG (46.3% and 37.2% 
respectively), and are likely to be similar in function 
to sugar biosynthetic pathway genes in the E. coli K-12 
colanic acid gene cluster. orf4 and orf5 show high levels of 
amino acid homology to manC and manB genes respectively. 
orf7 shows high level homology with rfbH which is an 
abequose pathway gene. orf8 encodes a protein with 12 
transmembrane segments and has similarity in secondary 
structure to other wzx genes and is likely therefore to 
be the O antigen flippase gene. 

Materials and Methods-part 2 

A. Nucleotide sequencing of 1 to 3,020 and 9,982 to 
14,516 of the E. coli O111 O antigen gene cluster 

The subclones which contained novel nucleotide 
sequences, pPR1231 (map position 0 to 1,510), pPR1237 
(map position -300 to 2,744), pPR1239 (map position 2,744 
to 4,168), pPR1245 (map position 9,736 to 12,007) and 
pPR1246 (map position 12,007 to 15,300) (Figure 2), were 
characterised as follows: the distal ends of the inserts 
of pPR1237, pPR1239 and pPR1245 were sequenced using the 
M13 forward and reverse primers located in the vector. 
PCR walking was carried out to sequence further into each 
insert using primers based on the sequence data, and the 
primers were tagged with M13 forward or reverse primer 
sequences for sequencing. This PCR walking procedure was 
repeated until the entire insert was sequenced. pPR1246 
was characterised from position 12,007 to 14,516. The 
DNA of these subclones was sequenced in both directions. 

The sequencing reactions were performed using the 
dideoxy termination method and thermocycling, and reaction 
products were analysed using fluorescent dye and an ABI 
automated sequencer (CA, USA). 

B. Analysis of the E. coli O111 O antigen gene cluster 
(positions 1 to 3,020 and 9,982 to 14,516 of Figure 5) 
nucleotide sequence data 

The gene organisation of regions of the E. coli O111 O 
antigen gene cluster which were not characterised by 
Bastin and Reeves [1995 "Sequence and analysis of the O 
antigen gene (rfb) cluster of Escherichia coli O111." Gene 
164: 17-23] (positions 1 to 3,020 and 9,982 to 14,516) is 
shown in Figure 3. There are two open reading frames in 
region 1. Four open reading frames are predicted in 
region 2. The position of each gene is listed in Table 
9. 

The deduced amino acid sequence of orf1 (wbdH) 
shares about 64% similarity with that of the rfp gene of 
Shigella dysenteriae. Rfp and WbdH have very similar 
hydrophobicity plots and both have a very convincing 
predicted transmembrane segment in a corresponding 
position. Rfp is a galactosyl transferase involved in 
the synthesis of LPS core, thus wbdH is likely to be a 
galactosyl transferase gene. orf2 has 85.7% identity at 
amino acid level to the gmd gene identified in the E. 
coli K-12 colanic acid gene cluster and is likely to be a 
gmd gene. orf9 encodes a protein with 10 predicted 
transmembrane segments and a large cytoplasmic loop. 
This inner membrane topology is a characteristic feature 
of all known O antigen polymerases, thus it is likely that 
orf9 is an O antigen polymerase gene, wzy. orf10 
(wbdL) has a deduced amino acid sequence with low 
homology with Lsi2 of Neisseria gonorrhoeae. Lsi2 is 
responsible for adding GlcNAc to galactose in the 
synthesis of lipooligosaccharide. Thus it is likely that 
wbdL is either a colitose or glucose transferase gene. 
orf11 (wbdM) shares high level nucleotide and amino acid 
similarity with TrsE of Yersinia enterocolitica. TrsE is 
a putative sugar transferase, thus it is likely that wbdM 
encodes the colitose or glucose transferase. 
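
The comparison of hydrophobicity plots and predicted transmembrane segments mentioned above can be illustrated with a sliding-window hydropathy profile. The text does not say which scale or program was used, so the sketch below assumes the Kyte-Doolittle scale, and the 19-residue window and 1.6 threshold are likewise assumptions chosen for illustration. The function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values (assumed scale; not specified in the text).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=19):
    """Mean hydropathy of every window of the given length along the protein."""
    scores = [KYTE_DOOLITTLE[aa] for aa in protein.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def candidate_tm_windows(protein, window=19, threshold=1.6):
    """Start positions (0-based) of windows at or above the threshold."""
    profile = hydropathy_profile(protein, window)
    return [i for i, score in enumerate(profile) if score >= threshold]
```

Two proteins with a shared membrane topology would be expected to give profiles with peaks at corresponding positions, which is the kind of agreement reported for Rfp and WbdH.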

In summary, three putative transferase genes and an O 
antigen polymerase gene were identified at map position 1 
to 3,020 and 9,982 to 14,516 of the E. coli O111 O antigen 
gene cluster. A search of GenBank has shown that there 
are no genes with significant similarity at the 
nucleotide sequence level for two of the three putative 
transferase genes or the polymerase gene. Figure 5 
provides the nucleotide sequence of the O111 antigen gene 
cluster. 

Materials and Methods-part 3 

A. PCR amplification of O157 antigen gene cluster from 
an E. coli O157:H7 strain (Strain C664-1992, from Statens 
Serum Institut, 5 Artillerivej, 2300 Copenhagen S, 
Denmark) 

The E. coli O157 O antigen gene cluster was amplified by 
using long PCR [Cheng et al. 1994, "Effective 
amplification of long targets from cloned inserts and 
human and genomic DNA" P.N.A.S. USA 91: 5695-569] with 
one primer (primer #412: att ggt agc tgt aag cca agg gcg 
gta gcg t) based on the JUMPStart sequence usually found 
in the promoter region of O antigen gene clusters [Hobbs, 
et al. 1994 "The JUMPStart sequence: a 39 bp element 
common to several polysaccharide gene clusters" Mol. 
Microbiol. 12: 855-856], and another primer #482 (cac tgc 
cat acc gac gac gcc gat ctg ttg ctt gg) based on the gnd 
gene usually found downstream of the O antigen gene 
cluster. Long PCR was carried out using the Expand Long 
Template PCR System from Boehringer Mannheim (Castle Hill 
NSW Australia), and products, 14 kb in length, from 
several reactions were combined and purified using the 
Promega Wizard PCR Preps DNA Purification System (Madison 
WI USA). The PCR product was then extracted with phenol 
and twice with ether, precipitated with 70% ethanol, and 
resuspended in 40µL of water. 
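
As a quick sanity check on the two long PCR primers quoted above, the following Python sketch computes their lengths, GC content and reverse complements. The sequences are those given in the text for primers #412 and #482, with the third triplet of #412 read as agc; the helper names are assumptions made for this example and are not part of the described method.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(primer):
    """Remove the spacing used in the text and convert to upper case."""
    return primer.replace(" ", "").upper()


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


# Primer sequences as quoted in the text (5' to 3').
PRIMER_412 = clean("att ggt agc tgt aag cca agg gcg gta gcg t")        # JUMPStart based
PRIMER_482 = clean("cac tgc cat acc gac gac gcc gat ctg ttg ctt gg")   # gnd based

for name, primer in (("#412", PRIMER_412), ("#482", PRIMER_482)):
    print(name, len(primer), round(gc_fraction(primer), 2),
          reverse_complement(primer))
```

Primers of this length (31 and 35 bases) are typical for long PCR, where longer oligonucleotides allow higher annealing temperatures.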

B. Construction of a random DNase I bank: 

Two aliquots containing about 150ng of DNA each were 
subjected to DNase I digestion using the Novagen DNase I 
Shotgun Cleavage (Madison WI USA) with a modified 
protocol as described. Each aliquot was diluted into 
45µL of 0.05M Tris-HCl (pH7.5), 0.05mg/mL BSA and 10mM 
MnCl2. 5µL of 1:3000 or 1:4500 dilution of DNaseI 
(Novagen) (Madison WI USA) in the same buffer was added 
into each tube respectively, and 10µL of stop buffer 
(100mM EDTA, 30% glycerol, 0.5% Orange G, 0.075% xylene 
cyanol) (Novagen) (Madison WI USA) was added after 
incubation at 15°C for 5 min. The DNA from the two DNaseI 
reaction tubes was then combined and fractionated on a 
0.8% LMT agarose gel, and the gel segment with DNA of 
about 1kb in size (about 1.5mL agarose) was excised. DNA 
was extracted from agarose using Promega Wizard PCR Preps 
DNA Purification (Madison WI USA) and resuspended in 200 
µL water, before being extracted with phenol and twice 
with ether, and precipitated. The DNA was then 
resuspended in 17.25 µL water and subjected to T4 DNA 
polymerase repair and single dA tailing using the Novagen 
Single dA Tailing Kit (Madison WI USA). The reaction 
product (85µL containing about 8ng DNA) was then 
extracted with chloroform:isoamyl alcohol (24:1) once and 
ligated to 3x 10'^ pmol pGEM-T (Promega) (Madison WI USA) 
in a total volume of 100µL. Ligation was carried out 
overnight at 4°C and the ligated DNA was precipitated and 
resuspended in 20µL water before being electroporated 
into E. coli strain JM109 and plated out on BCIG-IPTG 
plates to give a bank. 

C. Sequencing 

DNA templates from clones of the bank were prepared 
for sequencing using the 96-well format plasmid DNA 
miniprep kit from Advanced Genetic Technologies Corp 
(Gaithersburg MD USA). The inserts of these clones were 
sequenced from one or both ends using the standard M13 
sequencing primer sites located in the pGEM-T vector. 
Sequencing was carried out on an ABI377 automated 
sequencer (CA USA) as described above, after carrying out 
the sequencing reaction on an ABI Catalyst (CA USA). 
Sequence gaps and areas of inadequate coverage were PCR 
amplified directly from O157 chromosomal DNA using 
primers based on the already obtained sequencing data and 
sequenced using the standard M13 sequencing primer sites 
attached to the PCR primers. 

D. Analysis of the E. coli O157 O antigen gene cluster 
nucleotide sequence data 

Sequence data were processed and analysed using the 
Staden programs [Staden, R., 1982 "Automation of the 
computer handling of gel reading data produced by the 
shotgun method of DNA sequencing." Nuc. Acid Res. 10: 
4731-4751; Staden, R., 1986 "The current status and 
portability of our sequence handling software". Nuc. Acid 
Res. 14: 217-231; Staden, R. 1982 "An interactive 
graphics program for comparing and aligning nucleic acid 
and amino acid sequence". Nuc. Acid Res. 10: 2951-2961]. 

Figure 4 shows the structure of the E. coli O157 O antigen 
gene cluster. Twelve open reading frames were predicted 
from the sequence data, and the nucleotide and amino acid 
sequences of all these genes were then used to search the 
GenBank database for indication of possible function and 
specificity of these genes. The position of each gene is 
listed in Table 9. The nucleotide sequence is presented 
in Figure 6. 

orfs 10 and 11 showed high level identity to manC 
and manB and were named manC and manB respectively. orf7 
showed 89% identity (at amino acid level) to the gmd gene 
of the E. coli colanic acid capsule gene cluster 
(Stevenson G., K. et al. 1996 "Organisation of the 
Escherichia coli K-12 gene cluster responsible for 
production of the extracellular polysaccharide colanic 
acid". J. Bacteriol. 178:4885-4893) and was named gmd. 
orf8 showed 79% and 69% identity (at amino acid level) 
respectively to wcaG of the E. coli colanic acid capsule 
gene cluster and to the wbcJ (orf14.8) gene of the Yersinia 
enterocolitica O8 O antigen gene cluster (Zhang, L. et 
al. 1997 "Molecular and chemical characterization of the 
lipopolysaccharide O-antigen and its role in the 
virulence of Y. enterocolitica serotype O8". Mol. 
Microbiol. 23:63-76). Colanic acid and the Yersinia O8 O 
antigen both contain fucose as does the O157 O antigen. 
There are two enzymatic steps required for GDP-L-fucose 
synthesis from GDP-4-keto-6-deoxy-D-mannose, the product 
of the gmd gene. However, it has been shown 
recently (Tonetti, M. et al. 1996 "Synthesis of 
GDP-L-fucose by the human FX protein" J. Biol. Chem. 
271:27274-27279) that the human FX protein has "significant 
homology" with the wcaG gene (referred to as Yefb in that 
paper), and that the FX protein carries out both 
reactions to convert GDP-4-keto-6-deoxy-D-mannose to 
GDP-L-fucose. We believe that this makes a very strong case 
for orf8 carrying out these two steps and propose to name 
the gene fcl. In support of the one enzyme carrying out 
both functions is the observation that there are no genes 
other than manB, manC, gmd and fcl with similar levels of 
similarity between the three bacterial gene clusters for 
fucose containing structures. 

orf5 is very similar to wbeE (rfbE) of Vibrio 
cholerae O1, which is thought to be the perosamine 
synthetase, which converts GDP-4-keto-6-deoxy-D-mannose 
to GDP-perosamine (Stroeher, U.H. et al. 1995 "A putative 
pathway for perosamine biosynthesis is the first function 
encoded within the rfb region of Vibrio cholerae O1". 
Gene 166: 33-42). V. cholerae O1 and E. coli O157 O 
antigens contain perosamine and N-acetyl-perosamine 
respectively. The V. cholerae O1 manA, manB, gmd and 
wbeE genes are the only genes of the V. cholerae O1 gene 
cluster with significant similarity to genes of the E. 
coli O157 gene cluster, and we believe that our 
observations both confirm the prediction made for the 
function of wbeE of V. cholerae, and show that orf5 of the 
O157 gene cluster encodes GDP-perosamine synthetase. 
orf5 is therefore named per. orf5 plus about 100bp of 
the upstream region (position 4022-5308) was previously 
sequenced by Bilge, S.S. et al. [1996 "Role of the 
Escherichia coli O157:H7 O side chain in adherence and 
analysis of an rfb locus". Infect. Immun. 64:4795-4801]. 

orf12 shows high level similarity to the conserved 
region of about 50 amino acids of various members of an 
acetyltransferase family (Lin, W., et al. 1994 "Sequence 
analysis and molecular characterisation of genes required 
for the biosynthesis of type 1 capsular polysaccharide in 
Staphylococcus aureus". J. Bacteriol. 176: 7005-7016) and 
we believe it is the N-acetyltransferase that converts 
GDP-perosamine to GDP-perNAc. orf12 has been named wbdR. 

The genes manB, manC, gmd, fcl, per and wbdR account 
for all of the expected biosynthetic pathway genes of the 
O157 gene cluster. 

The remaining biosynthetic step(s) required are for 
synthesis of UDP-GalNAc from UDP-Glc. It has been 
proposed (Zhang, L., et al. 1997 "Molecular and chemical 
characterisation of the lipopolysaccharide O-antigen and 
its role in the virulence of Yersinia enterocolitica 
serotype O8". Mol. Microbiol. 23:63-76) that in Yersinia 
enterocolitica UDP-GalNAc is synthesised from UDP-GlcNAc 
by a homologue of galactose epimerase (GalE), for which 
there is a galE like gene in the Yersinia enterocolitica 
O8 gene cluster. In the case of O157 there is no galE 
homologue in the gene cluster and it is not clear how 
UDP-GalNAc is synthesised. It is possible that the 
galactose epimerase encoded by the galE gene in the gal 
operon can carry out conversion of UDP-GlcNAc to 
UDP-GalNAc in addition to conversion of UDP-Glc to UDP-Gal. 
There do not appear to be any gene(s) responsible for 
UDP-GalNAc synthesis in the O157 gene cluster. 

orf4 shows similarity to many wzx genes and is named 
wzx, and orf2, which shows similarity of secondary 
structure in the predicted protein to other wzy genes, 
is for that reason named wzy. 

The orf1, orf3 and orf6 gene products all have 
characteristics of transferases, and have been named 
wbdN, wbdO and wbdP respectively. The O157 O antigen has 
4 sugars and 4 transferases are expected. The first 
transferase to act would put a sugar phosphate onto 
undecaprenol phosphate. The two transferases known to 
perform this function, WbaP (RfbP) and WecA (Rfe), 
transfer galactose phosphate and N-acetyl-glucosamine 
phosphate respectively to undecaprenol phosphate. 
Neither of these sugars is present in the O157 structure. 

Further, none of the presumptive transferases in the 
O157 gene cluster has the transmembrane segments found in 
WecA and WbaP, which transfer a sugar phosphate to 
undecaprenol phosphate, and which would be expected for any 
protein that transfers a sugar to undecaprenol phosphate, 
which is embedded within the membrane. 

The wecA gene, whose product transfers GlcNAc-P to 
undecaprenol phosphate, is located in the Enterobacterial 
Common Antigen (ECA) gene cluster and it functions in ECA 
synthesis in most and perhaps all E. coli strains, and 
also in O antigen synthesis for those strains which have 
GlcNAc as the first sugar in the O unit. 

It appears that WecA acts as the transferase for 
addition of GalNAc-1-P to undecaprenol phosphate for the 
Yersinia enterocolitica O8 O antigen [Zhang et al. 1997 
"Molecular and chemical characterisation of the 
lipopolysaccharide O antigen and its role in the 
virulence of Yersinia enterocolitica serotype O8" Mol. 
Microbiol. 23: 63-76] and perhaps does so here as the 
O157 structure includes GalNAc. WecA has also been 
reported to add glucose-1-P to undecaprenol 
phosphate in E. coli O8 and O9 strains, and an 
alternative possibility for transfer of the first sugar 
to undecaprenol phosphate is WecA mediated transfer of 
glucose, as there is a glucose residue in the O157 O 
antigen. In either case the requisite number of 
transferase genes are present if GalNAc or Glc is 
transferred by WecA and the side chain Glc is transferred 
by a transferase outside of the O antigen gene cluster. 

orf9 shows high level similarity (44% identity at 
amino acid level, same length) with the wcaH gene of the E. 
coli colanic acid capsule gene cluster. The function of 
this gene is unknown, and we give orf9 the name wbdQ. 

The DNA between manB and wbdR has strong sequence 
similarity to one of the H-repeat units of E. coli K12. 
Both of the inverted repeat sequences flanking this 
region are still recognisable, each with two of the 11 
bases being changed. The H-repeat associated protein 
encoding gene located within this region has a 267 base 
deletion and mutations in various positions. It seems 
that the H-repeat unit has been associated with this gene 
cluster for a long period of time since it translocated 
to the gene cluster, perhaps playing a role in assembly 
of the gene cluster as has been proposed in other cases. 

Materials and Methods - part 4 

To test our hypothesis that O antigen genes for 
transferases and the wzx, wzy genes were more specific 
than pathway genes for diagnostic PCR, we first carried 
out PCR using primers for all the E. coli O16 O antigen 
genes (Table 7). The PCR was then carried out using PCR 
primers for E. coli O111 transferase, wzx and wzy genes 
(Table 8, 8A). PCR was also carried out using PCR 
primers for the E. coli O157 transferase, wzx and wzy 
genes (Table 9, 9A). 

Chromosomal DNA from the 166 serotypes of E. coli 
available from Statens Serum Institut, 5 Artillerivej, 
2300 Copenhagen, Denmark was isolated using the Promega 
Genomic (Madison WI USA) isolation kit. Note that 164 of 
the serogroups are described by Ewing W. H.: Edwards and 
Ewing's "Identification of the Enterobacteriaceae" 
Elsevier, Amsterdam 1986 and that they are numbered 1-171 
with numbers 31, 47, 67, 72, 93, 94 and 122 no longer 
valid. Of the two serogroup 19 strains we used 19ab 
strain F8188-41. Lior H. 1994 ["Classification of 
Escherichia coli" In Escherichia coli in domestic animals 
and humans pp 31-72. Edited by C.L. Gyles, CAB 
International] adds two more numbered 172 and 173 to give 
the 166 serogroups used. Pools containing 5 to 8 samples 
of DNA per pool were made. Pool numbers 1 to 19 (Table 
4) were used in the E. coli O111 and O157 assay. Pool 
numbers 20 to 28 were also used in the O111 assay, and 
pool numbers 22 to 24 contained E. coli O111 DNA and were 
used as positive controls (Table 5). Pool numbers 29 to 
42 were also used in the O157 assay, and pool numbers 31 
to 36 contained E. coli O157 DNA, and were used as 
positive controls (Table 6). Pool numbers 2 to 20, 30, 
43 and 44 were used in the E. coli O16 assay (Tables 4 to 
6). Pool number 44 contained DNA of E. coli K-12 strains 
C600 and WG1 and was used as a positive control as 
between them they have all of the E. coli K-12 O16 O 
antigen genes. 
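
The serogroup count given above can be checked with a few lines of Python. The sketch only reproduces the arithmetic stated in the text (numbers 1 to 171 less the seven invalid numbers, plus 172 and 173); the variable names are illustrative assumptions, and the actual composition of each pool is given in Tables 4 to 6 and is not reconstructed here.

```python
# O serogroup numbers that are no longer valid, as listed in the text.
INVALID_NUMBERS = {31, 47, 67, 72, 93, 94, 122}

# Serogroups numbered 1-171, less the invalid numbers.
ewing_serogroups = [n for n in range(1, 172) if n not in INVALID_NUMBERS]

# The two further serogroups, 172 and 173.
all_serogroups = ewing_serogroups + [172, 173]

print(len(ewing_serogroups))  # serogroups numbered up to 171 that remain valid
print(len(all_serogroups))    # total serogroups used in the assays
```

The two counts come out as 164 and 166, matching the figures quoted in the paragraph.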

PCR reactions were carried out under the following 
conditions: denaturing, 94°C/30 s; annealing, temperature 
varies (refer to Tables)/30 s; extension, 72°C/1 min; 30 
cycles. Each PCR reaction was carried out in a volume of 
25 µL for each pool. After the PCR reaction, 10 µL of PCR 
product from each pool was run on an agarose gel to check 
for amplified DNA. 
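The cycling conditions above can be written down as a small data structure, which also gives a rough estimate of the time spent in the programmed holds. This is a sketch only: the class and function names are hypothetical, the 55°C annealing temperature is a placeholder (the text states that the annealing temperature varies by primer pair), and ramp times between steps are ignored.

```python
from dataclasses import dataclass
from typing import List

@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    seconds: int

def make_program(annealing_temp_c: float) -> List[Step]:
    """One PCR cycle as described in the text: 94 C/30 s, anneal/30 s, 72 C/1 min."""
    return [
        Step("denature", 94.0, 30),
        Step("anneal", annealing_temp_c, 30),   # primer-specific, see the Tables
        Step("extend", 72.0, 60),
    ]

def hold_time_seconds(steps: List[Step], cycles: int = 30) -> int:
    """Total programmed hold time, ignoring ramping between temperatures."""
    return cycles * sum(step.seconds for step in steps)

program = make_program(annealing_temp_c=55.0)   # 55 C is a placeholder value
total = hold_time_seconds(program)
print(total / 60)   # 60.0 (minutes of hold time over 30 cycles)
```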

Each E. coli chromosomal DNA sample was checked by 
gel electrophoresis for the presence of chromosomal DNA 
and by PCR amplification of the E. coli mdh gene using 
oligonucleotides based on E. coli K-12 [Boyd et al. 
(1994) "Molecular genetic basis of allelic polymorphism 
in malate dehydrogenase (mdh) in natural populations of 
Escherichia coli and Salmonella enterica" Proc. Natl. 
Acad. Sci. USA 91:1280-1284]. Chromosomal DNA samples 
from other bacteria were only checked by gel 
electrophoresis of chromosomal DNA. 

A. Primers based on the E. coli O16 O antigen gene cluster 
sequence. 

The O antigen gene cluster of E. coli O16 was the 
only typical E. coli O antigen gene cluster that had been 
fully sequenced prior to that of O111, and we chose it 
for testing our hypothesis. One pair of primers for each 
gene was tested against pools 2 to 20, 30 and 43 of E. 
coli chromosomal DNA. The primers, annealing 
temperatures and functional information for each gene are 
listed in Table 8. 

For the five pathway genes, there were 17/21, 13/21, 
0/21, 0/21 and 0/21 positive pools for rmlB, rmlD, rmlA, 
rmlC and glf respectively (Table 7). For the wzx, wzy 
and three transferase genes there were no positives 
amongst the 21 pools of E. coli chromosomal DNA tested 
(Table 7). In each case the #44 pool gave a positive 
result. 
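The pathway gene counts reported above can be turned into cross-reaction rates to make the comparison explicit. The snippet below is a minimal sketch that uses only the figures quoted in the text; the dictionary and function names are illustrative.

```python
POOLS_TESTED = 21  # pools 2 to 20, 30 and 43

# Positive pools per pathway gene primer pair, as reported in the text.
pathway_positives = {"rmlB": 17, "rmlD": 13, "rmlA": 0, "rmlC": 0, "glf": 0}

def cross_reaction_rate(n_positive: int, n_pools: int = POOLS_TESTED) -> float:
    """Fraction of non-O16 pools giving a band with an O16-derived primer pair."""
    return n_positive / n_pools

rates = {gene: round(cross_reaction_rate(n), 2) for gene, n in pathway_positives.items()}
cross_reacting = [gene for gene, n in pathway_positives.items() if n > 0]

print(rates)            # rmlB 0.81, rmlD 0.62, the rest 0.0
print(cross_reacting)   # ['rmlB', 'rmlD']
```

By the same measure, the wzx, wzy and transferase primer pairs all have a rate of 0/21 in this experiment.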

B. Primers based on the E. coli O111 O antigen gene 
cluster sequence. 

One to four pairs of primers for each of the 
transferase, wzx and wzy genes of O111 were tested 
against pools 1 to 21 of E. coli chromosomal DNA 
(Table 8). For wbdH, four pairs of primers, which bind 
to various regions of this gene, were tested and found to 
be specific for O111, as there was no amplified DNA of the 
correct size in any of those 21 pools of E. coli 
chromosomal DNA tested. Three pairs of primers for wbdM 
were tested, and all are specific, although primers 
#985/#986 produced a band of the wrong size from one 
pool. Three pairs of primers for wzx were tested and 
all were specific. Two pairs of primers were tested 
for wzy; both are specific, although #980/#983 gave a band 
of the wrong size in all pools. One pair of primers for 
wbdL was tested and found to be non-specific, and therefore 
no further test was carried out. Thus, wzx, wzy and two of 
the three transferase genes are highly specific to O111. 

Bands of the wrong size found in amplified DNA are 
assumed to be due to chance hybridisation of genes widely 
present in E. coli. The primers, annealing temperatures 
and positions for each gene are in Table 8. 

The O111 assay was also performed using pools 
including DNA from O antigen expressing Yersinia 
pseudotuberculosis, Shigella boydii and Salmonella 
enterica strains (Table 8A). None of the 
oligonucleotides derived from wbdH, wzx, wzy or wbdM gave 
amplified DNA of the correct size with these pools. 
Notably, pool number 25 includes S. enterica Adelaide, 
which has the same O antigen as E. coli O111: this pool 
did not give a positive PCR result for any primers tested, 
indicating that these genes are highly specific for E. 
coli O111. 

Each of the 12 pairs binding to wbdH, wzx, wzy and 
wbdM produces a band of the predicted size with the pools 
containing O111 DNA (pools number 22 to 24). As pools 22 
to 24 included DNA from all strains present in pool 21 
plus O111 strain DNA (Table 5), we conclude that the 12 
pairs of primers all give a positive PCR test with each 
of three unrelated O111 strains but not with any other 
strains tested. Thus these genes are highly specific for 
E. coli O111. 
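The reasoning in the preceding paragraph is a pool subtraction: a positive pool that differs from a negative pool by a single strain attributes the signal to that strain. The sketch below illustrates this with the pool compositions from Table 5; the function name and the boolean PCR results are illustrative stand-ins for the gel readings, not data structures from the original work.

```python
def strains_unique_to(positive_pool, negative_pool):
    """Strains present in the positive pool but absent from the negative pool."""
    return set(positive_pool) - set(negative_pool)

# Pool compositions as listed in Table 5 (labels abbreviated for illustration).
pool_21 = {"O161", "O163", "O164", "O8", "O9", "O124"}
pool_22 = pool_21 | {"O111 Stoke W"}
pool_23 = pool_21 | {"O111:H2 C1250-1991"}
pool_24 = pool_21 | {"O111:H12 C156-1989"}

# Illustrative readings: band of predicted size seen (True) or not (False).
pcr_positive = {21: False, 22: True, 23: True, 24: True}

responsible = set()
for number, pool in ((22, pool_22), (23, pool_23), (24, pool_24)):
    if pcr_positive[number] and not pcr_positive[21]:
        responsible |= strains_unique_to(pool, pool_21)

print(sorted(responsible))   # the three O111 strains
```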



C. Primers based on the E. coli O157 O antigen gene 
cluster sequence. 

Two or three primer pairs for each of the 
transferase, wzx and wzy genes of O157 were tested 
against E. coli chromosomal DNA of pools 1 to 19, 29 and 
30 (Table 9). For wbdN, three pairs of primers, which 
bind to various regions of this gene, were tested and 
found to be specific for O157, as there was no amplified 
DNA in any of those 21 pools of E. coli chromosomal DNA 
tested. Three pairs of primers for wbdO were tested, and 
all are specific, although primers #1211/#1212 
produced two or three bands of the wrong size from all 
pools. Three pairs of primers were tested for wbdP and 
all were specific. Two pairs of primers were tested 
for wbdR and both were specific. For wzy, three 
pairs of primers were tested and all were specific, 
although primer pair #1203/#1204 produced one or three 
bands of the wrong size in each pool. For wzx, two pairs 
of primers were tested and both were specific, although 
primer pair #1217/#1218 produced 2 bands of the wrong size in 
2 pools, and 1 band of the wrong size in 7 pools. Bands of 
the wrong size found in amplified DNA are assumed to be 
due to chance hybridisation of genes widely present in E. 
coli. The primers, annealing temperatures and functional 
information for each gene are in Table 9. 

The O157 assay was also performed using pools 37 to 
42, including DNA from O antigen expressing Yersinia 
pseudotuberculosis, Shigella boydii, Yersinia 
enterocolitica O9, Brucella abortus and Salmonella 
enterica strains (Table 9A). None of the 
oligonucleotides derived from wbdN, wzy, wbdO, wzx, wbdP 
or wbdR reacted specifically with these pools, except 
that primer pair #1203/#1204 produced two bands with Y. 
enterocolitica O9, one of which is of the same 
size as that from the positive control. Primer pair 
#1203/#1204 binds to wzy. The predicted secondary 
structures of Wzy proteins are generally similar, 
although there is very low similarity at the amino acid or 
DNA level among the sequenced wzy genes. Thus, it is 
possible that Y. enterocolitica O9 has a wzy gene closely 
related to that of E. coli O157. It is also possible 
that this band is due to chance hybridisation of another 
gene, as the other two wzy primer pairs (#1205/#1205 and 
#1207/#1208) did not produce any band with Y. 
enterocolitica O9. Notably, pool number 37 includes S. 
enterica Landau, which has the same O antigen as E. coli 
O157, and pools 38 and 39 contain DNA of B. abortus and Y. 
enterocolitica O9, which cross react serologically with E. 
coli O157. This result indicates that these genes are 
highly O157 specific, although one primer pair may have 
cross reacted with Y. enterocolitica O9. 

Each of the 15 pairs binding to wbdN, wzx, wzy, 
wbdO, wbdP and wbdR produces a band of the predicted size 
with the pools containing O157 DNA (pools number 31 to 
36). As pool 29 included DNA from all strains present in 
pools 31 to 36 other than the O157 strain DNA (Table 6), we 
conclude that the 15 pairs of primers all give a positive 
PCR test with each of the six unrelated O157 strains. 

Thus PCR using primers based on genes wbdN, wzy, 
wbdO, wzx, wbdP and wbdR is highly specific for E. coli 
O157, giving positive results with each of six unrelated 
O157 strains, while only one primer pair gave a band of 
the expected size with one of three strains with O 
antigens known to cross-react serologically with E. coli 
O157. 
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TABLE 1 

H7 strains used in this work in addition to the H 
antigen type strains 

Name used in    Serotype    Original name    Source
this study
M527            O157:H7     C664-1992        a
M917            O18ac:H7    A57              IMVS
M918            O18ac:H7    A62              IMVS
M973            O2:H7       A1107            CDC
M1004           O157:H7     EH7              b
M1179           O18ac:H7    D-M3291/54       IMVS
M1200           O7:H7       A64              c
M1211           O19ab:H7    F8188-41         IMVS
M1328           O53:H7      14097            IMVS
M1686           O55:H7      TB156            d

a. Statens Serum Institut, Copenhagen, Denmark. 
b. Dr R. Brown of Royal Children's Hospital, Melbourne, 
Australia. 
c. Max-Planck Institut für molekulare Genetik, Berlin, 
Germany. 
d. Dr P. Tarr of Children's Hospital and Medical Center, 
University of Washington, USA. 
IMVS, Institute of Medical and Veterinary Science, Adelaide, 
Australia. 
CDC, Centers for Disease Control and Prevention, Atlanta, 
USA. 
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Table 2 



Oligonucleotides used to PCR amplify fliC genes 
from different H type strains for sequencing 

H Type Strain    Annealing Temperature (°C)    Primers Used
1                55                            #1575/#1576
2                55                            #1285/#1286
3                55                            #1285/#1286
4                50                            #1431/#1432
5                60                            #1285/#1286
6                55                            #1575/#1576
7                55                            #1575/#1576
8                65                            #1431/#1432
9                60                            #1575/#1576
10               55                            #1575/#1576
11               55                            #1285/#1286
12               60                            #1575/#1576
14               60                            [illegible]
15               60                            #1575/#1576
16               60                            #1575/#1576
17               60                            #1417/#1418
18               60                            [illegible]
19               60                            #1575/#1576
20               60                            #1575/#1576
21               55                            [illegible]
23               60                            #1575/#1576
24               60                            [illegible]
25               60                            [illegible]
26               60                            #1575/#1576
27               50                            #1431/#1432
28               60                            #1575/#1576
29               60                            #1285/#1286
30               60                            #1575/#1576
31               60                            #1575/#1576
32               60                            #1575/#1576
33               60                            #1285/#1286
34               55                            #1575/#1576
35               50                            #1431/#1432
37               60                            #1285/#1286
38               60                            #1285/#1286
39               55                            #1285/#1286
40               55                            #1285/#1286
41               60                            #1575/#1576
42               60                            #1285/#1286
43               60                            #1575/#1576
44               60                            #1285/#1286
45               60                            #1575/#1576
46               60                            #1575/#1576
47               55                            #1285/#1286
48               60                            #1575/#1576
49               60                            #1575/#1576
50               60                            #1285/#1286
51               60                            #1575/#1576
52               60                            #1575/#1576
54               50                            #1431/#1432
55               60                            #1285/#1286
56               60                            #1285/#1286
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Table 3 

Summary of the flagellin sequences obtained and specific H type 
oligonucleotide primers 

H type strain(s)   H specificity   H type strain from which    Positions of     Positions of
the sequenced      coded by the    the flagellin gene          primer 1         primer 2
gene(s) obtained   gene(s)         sequence was used for
from                               primer choice
1                  1               1                           892-909          1172-1189
2                  2               2                           568-587          1039-1056
4, 17, 44          4               4                           466-483          628-648
5                  5               5                           697-714          877-897
6                  6               6                           565-585          799-816
7                  7               7                           553-570          1483-1500
                                                               (primer #1806)   (primer #1809)
9                  9               9                           616-633          838-855
10 (50)***         10              10                          559-579          697-717
11                 11              11                          586-606*         791-810*
12                 12              12                          892-909          1172-1189
14                 14              14                          586-606          793-813
15                 15              15                          640-660          817-834
3                  16              3                           649-666          925-942
18                 18              18                          589-606          802-819
19                 19              19                          607-624          838-855
20                 20              20                          574-591          760-780
21, 47             21              21                          676-693**        862-879**
23                 23              23                          637-654          1336-1353
24                 24              24                          496-516          772-792
26                 26              26                          553-570          772-789
27                 27              27                          685-702          799-819
28                 28              28                          592-609          778-798
29                 29              29                          538-555          757-774
30                 30              30                          814-831          943-962
31                 31              31                          571-588          790-807
32                 32              32                          814-831          1057-1074
33                 33              33                          553-570          718-735
34                 34              34                          568-585          796-816
38, 55             38              38                          553-573          709-729
39                 39              39                          556-573          718-735
41                 41              41                          598-615          784-801
42                 42              42                          547-567          715-735
43                 43              43                          580-597          844-861
45                 45              45                          640-657          943-963
46                 46              46                          565-582          781-801
49                 49              49                          589-609          754-771
51                 51              51                          565-582          1042-1059
52                 52              52                          598-615          829-846
56                 56              56                          697-714          877-897
8 and 40                           8                           562-579          1045-1062
25                                 25                          529-549          703-723
35                                 non-functional H11 gene     769-789*         1045-1065*
37                                 37                          520-537          715-735
48                                 48                          568-585          835-852
54                                 non-functional H21 gene     988-1008**       1344-1364**

*   See section 13 for choice of primers for the flagellin gene of H11 
**  See section 13 for choice of primers for the flagellin gene of H21 
*** See text 
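The primer positions in Table 3 determine the size of the band expected from each H type specific PCR. The sketch below computes that size for three rows of the table. It assumes (this is not stated in the table itself) that both primers are given in the coordinates of the same flagellin gene sequence and that primer 2 is the downstream primer; the function name is illustrative.

```python
def expected_product_length(primer1, primer2):
    """Length in bp from the first base of primer 1 to the last base of primer 2.

    Each primer is a (start, end) pair of 1-based positions on the same
    sequence, with primer 2 lying downstream of primer 1 (an assumption).
    """
    start, _ = primer1
    _, end = primer2
    if end < start:
        raise ValueError("primer 2 must lie downstream of primer 1")
    return end - start + 1

# Positions copied from Table 3.
table3_rows = {
    "H1": ((892, 909), (1172, 1189)),
    "H4": ((466, 483), (628, 648)),
    "H7": ((553, 570), (1483, 1500)),   # primers #1806 and #1809
}

for h_type, (p1, p2) in table3_rows.items():
    print(h_type, expected_product_length(p1, p2))
# H1 298
# H4 183
# H7 948
```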
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Table 3A 

Cloning, expression and identification of flagellin genes 

H type strain   Primers used    Annealing     Plasmid        Host strain   Antiserum which     H antigen
from which      for PCR         temperature   carrying the   used for      reacts with an      encoded by
the H antigen   amplification   (°C) used     H antigen      expression    E. coli fliC        the cloned
gene was        of the H        for PCR       gene                         deletion strain     gene
amplified       antigen gene    amplification                              carrying the
                                                                           plasmid
H1              [illegible]     [illegible]   [illegible]    M2126         H1                  H1
H2              [illegible]     [illegible]   [illegible]    P5560         H2                  H2
H3              [illegible]     [illegible]   [illegible]    [illegible]   H16                 H16
H4              [illegible]     [illegible]   [illegible]    P5560         H4                  H4
H5              [illegible]     [illegible]   [illegible]    M2126         H5                  H5
H6              [illegible]     [illegible]   [illegible]    P5560         H6                  H6
H7              [illegible]     [illegible]   pPR1919        P5560         H7                  H7
H9              [illegible]     [illegible]   pPR1922        P5560         H9                  H9
H10             [illegible]     55            pPR1923        P5560         H10                 H10
H11             [illegible]     55            pPR1981        M2126         H11                 H11
H12             #1868 & #1870   60            pPR1990        M2126         H12                 H12
H14             #1868 & #1870   55            pPR1924        P5560         H14                 H14
H15             #1868 & #1870   55            pPR1925        P5560         H15                 H15
H17             #1878 & #1885   65            pPR1957        P5560         H4                  H4
H18             #1868 & #1870   55            pPR1986        M2126         H18                 H18
H19             #1868 & #1870   55            pPR1927        P5560         H19                 H19
H20             #1868 & #1870   55            pPR1963        M2126         H20                 H20
H21             #1868 & #1870   55            pPR1995        M2126         H21                 H21
H23             #1868 & #1869   55            pPR1942        P5560         H23                 H23
H24             [illegible]     [illegible]   pPR1971        M2126         H24                 H24
H26             [illegible]     [illegible]   pPR1928        P5560         H26                 H26
H27             [illegible]     [illegible]   [illegible]    [illegible]   H27                 H27
H28             [illegible]     [illegible]   [illegible]    P5560         H28                 H28
H29             [illegible]     [illegible]   [illegible]    [illegible]   [illegible]         H29
H30             [illegible]     [illegible]   [illegible]    P5560         H30                 H30
H31             [illegible]     [illegible]   [illegible]    M2126         H31                 H31
H32             [illegible]     [illegible]   [illegible]    P5560         [illegible]         H32
H33             [illegible]     [illegible]   [illegible]    M2126         [illegible]         H33
H34             #1868 & #1870   65            pPR1930        P5560         H34                 [illegible]
H38             #1868 & #1870   48            pPR1984        M2126         H38                 H38
H39             #1868 & #1870   48            pPR1982        M2126         H39                 H39
H41             #1868 & #1870   65            pPR1931        P5560         H41                 H41
H42             #1868 & #1870   50            pPR1979        M2126         H42                 H42
H43             #1868 & #1870   65            pPR1968        M2126         H43                 H43
H45             #1868 & #1870   60            pPR1943        P5560         H45                 H45
H46             #1868 & #1870   60            pPR1966        M2126         H46                 H46
H49             #1868 & #1870   60            pPR1985        M2126         H49                 H49
H51             #1868 & #1870   65            pPR1941        P5560         H51                 H51
H52             #1868 & #1870   65            pPR1935        P5560         H52                 H52
H56             #1868 & #1870   50            pPR1978        M2126         H56                 H56
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Table 3b Oligonucleotide primers used for PCR amplification and 
cloning of H antigen genes 

#1868 5'- cat g cc atg gc a caa gtc att aat acc -3' 
NcoI 

#1869 5'- ata tgt cga ct t aac cct gca gca gag aca g -3' 
SalI 

#1870 5'- a tg gat cct taa ccc tgc agc aga gac ag -3' 
BamHI 

#1871 5'- aactgcagt taa ccc tgt agc aga gac ag -3' 
PstI 

#1872 5'- cg g gat ccc gca gac tgg ttc ttg ttg at -3' 
BamHI 

#1878 5'- cg g gat cc a ctt cta tcg agc gcc tct ct -3' 
BamHI 

#1884 5'- gc t cta gag cgc aga tca ttc agc agg cc -3' 
XbaI 

#1885 5'- gc t cta gac atg ttg gac act tcg gtc gc -3' 
XbaI 
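Each cloning primer in Table 3b carries the recognition site of the restriction enzyme named beneath it. A quick way to confirm this is to search each primer for the site, as sketched below. The primer sequences are transcribed from the table with spaces removed; the recognition sequences are the standard ones for these enzymes, and the function and dictionary names are illustrative.

```python
# Standard recognition sequences (5'->3') of the enzymes named in Table 3b.
RECOGNITION_SITES = {
    "NcoI": "CCATGG",
    "SalI": "GTCGAC",
    "BamHI": "GGATCC",
    "PstI": "CTGCAG",
    "XbaI": "TCTAGA",
}

# Primers transcribed from Table 3b (spacing removed), with the enzyme listed for each.
PRIMERS = {
    "#1868": ("catgccatggcacaagtcattaatacc", "NcoI"),
    "#1869": ("atatgtcgacttaaccctgcagcagagacag", "SalI"),
    "#1870": ("atggatccttaaccctgcagcagagacag", "BamHI"),
    "#1871": ("aactgcagttaaccctgtagcagagacag", "PstI"),
    "#1884": ("gctctagagcgcagatcattcagcaggcc", "XbaI"),
}

def site_offset(primer: str, enzyme: str) -> int:
    """0-based offset of the enzyme's site in the primer, or -1 if absent."""
    return primer.upper().find(RECOGNITION_SITES[enzyme])

for name, (sequence, enzyme) in PRIMERS.items():
    print(name, enzyme, site_offset(sequence, enzyme))
```

In every case the site sits near the 5' end of the primer, leaving the 3' portion free to anneal to the flagellin gene.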
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TABLE 4 



Pool   Strains whose chromosomal DNA is included in the pool                           Source*
No.
1      E. coli type strains for O serotypes 1, 2, 3, 4, 10, 16, 18 and 39                IMVS a
2      E. coli type strains for O serotypes 40, 41, 48, 49, 71, 73, 88 and 100           IMVS
3      E. coli type strains for O serotypes 102, 109, 119, 120, 121, 125, 126 and        IMVS
       [illegible]
4      E. coli type strains for O serotypes 138, 139, 149, 7, 5, 6, 11 and 12            IMVS
5      E. coli type strains for O serotypes 13, 14, 15, 17, 19ab, 20, 21 and 22          IMVS
6      E. coli type strains for O serotypes 23, 24, 25, 26, 27, 28, 29 and 30            IMVS
7      E. coli type strains for O serotypes 32, 33, 34, 35, 36, 37, 38 and 42            IMVS
8      E. coli type strains for O serotypes 43, 44, 45, 46, 50, 51, 52 and 53            IMVS
9      E. coli type strains for O serotypes 54, 55, 56, 57, 58, 59, 60 and 61            IMVS
10     E. coli type strains for O serotypes 62, 63, 64, 65, 66, 68, 69 and 70            IMVS
11     E. coli type strains for O serotypes 74, 75, 76, 77, 78, 79, 80 and 81            IMVS
12     E. coli type strains for O serotypes 82, 83, 84, 85, 86, 87, 89 and 90            IMVS
13     E. coli type strains for O serotypes 91, 92, 95, 96, 97, 98, 99 and 101           IMVS
14     E. coli type strains for O serotypes 103, 104, 105, 106, 107, 108 and 110         IMVS
15     E. coli type strains for O serotypes 112, 162, 113, 114, 115, 116, 117 and 118    IMVS
16     E. coli type strains for O serotypes 123, 165, 166, 167, 168, 169, 170 and 171    See b
17     E. coli type strains for O serotypes 172, 173, 127, 128, 129, 130, 131 and 132    See c
18     E. coli type strains for O serotypes 133, 134, 135, 136, 140, 141, 142 and 143    IMVS
19     E. coli type strains for O serotypes 144, 145, 146, 147, 148, 150, 151 and 152    IMVS

a. Institute of Medical and Veterinary Science, Adelaide, Australia 

b. 123 from IMVS; the rest from Statens Serum Institut, Copenhagen, Denmark 

c. 172 and 173 from Statens Serum Institut, Copenhagen, Denmark; the rest from 
IMVS 
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TABLE 5 



Pool   Strains whose chromosomal DNA is included in the pool                           Source*
No.
20     E. coli type strains for O serotypes 153, 154, 155, 156, 157, 158, 159 and 160    IMVS
21     E. coli type strains for O serotypes 161, 163, 164, 8, 9 and 124                  IMVS
22     As pool #21, plus E. coli O111 type strain Stoke W                                IMVS
23     As pool #21, plus E. coli O111:H2 strain C1250-1991                               See d
24     As pool #21, plus E. coli O111:H12 strain C156-1989                               See e
25     As pool #21, plus S. enterica serovar Adelaide                                    See f
26     Y. pseudotuberculosis strains of O groups IA, IIA, IIB, IIC, III, IVA, IVB,       See g
       VA, VB, VI and VII
27     S. boydii strains of serogroups 1, 3, 4, 5, 6, 8, 9, 10, 11, 12, 14 and 15        See h
28     S. enterica strains of serovars (each representing a different O group)           IMVS
       Typhi, Montevideo, Ferruch, Jangwani, Raus, Hvittingfoss, Waycross,
       Dan, Dugbe, Basel, 65:i:e,n,z15 and 52:d:e,n,x,z15

d. C1250-1991 from Statens Serum Institut, Copenhagen, Denmark 

e. C156-1989 from Statens Serum Institut, Copenhagen, Denmark 

f. S. enterica serovar Adelaide from IMVS 

g. Dr S Aleksic of Institute of Hygiene, Germany 

h. Dr J Lefebvre of Bacterial Identification Section, Laboratoire de Santé Publique 
du Québec, Canada 
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TABLE 6 



PCT/AU99/00385 



Pool   Strains whose chromosomal DNA is included in the pool                           Source*
No.
29     E. coli type strains for O serotypes 153, 154, 155, 156, 158, 159 and 160         IMVS
30     E. coli type strains for O serotypes 161, 163, 164, 8, 9, 111 and 124             IMVS
31     As pool #29, plus E. coli O157 type strain A2 (O157:H19)                          IMVS
32     As pool #29, plus E. coli O157:H16 strain C475-89                                 See d
33     As pool #29, plus E. coli O157:H45 strain C727-89                                 See d
34     As pool #29, plus E. coli O157:H2 strain C252-94                                  See d
35     As pool #29, plus E. coli O157:H39 strain C258-94                                 See d
36     As pool #29, plus E. coli O157:H26                                                See e
37     As pool #29, plus S. enterica serovar Landau                                      See f
38     As pool #29, plus Brucella abortus                                                See g
39     As pool #29, plus Y. enterocolitica O9                                            See h
40     Y. pseudotuberculosis strains of O groups IA, IIA, IIB, IIC, III, IVA, IVB,       See i
       VA, VB, VI and VII
41     S. boydii strains of serogroups 1, 3, 4, 5, 6, 8, 9, 10, 11, 12, 14 and 15        See j
42     S. enterica strains of serovars (each representing a different O group)           IMVS
       Typhi, Montevideo, Ferruch, Jangwani, Raus, Hvittingfoss, Waycross, Dan,
       Dugbe, Basel, 65:i:e,n,z15 and 52:d:e,n,x,z15
43     E. coli type strains for O serotypes 1, 2, 3, 4, 10, 18 and 29                    IMVS
44     As pool #43, plus E. coli K-12 strains C600 and WG1                               IMVS
                                                                                         See k

d. O157 strains from Statens Serum Institut, Copenhagen, Denmark 

e. O157:H26 from Dr R Brown of Royal Children's Hospital, Melbourne, Victoria 

f. S. enterica serovar Landau from Dr M Poppoff of Institut Pasteur, Paris, France 

g. B. abortus from the culture collection of The University of Sydney, Sydney, Australia 

h. Y. enterocolitica O9 from Dr. K. Bettelheim of Victorian Infectious Diseases Reference 
Laboratory, Victoria, Australia. 

i. Dr S Aleksic of Institute of Hygiene, Germany 

j. Dr J Lefebvre of Bacterial Identification Section, Laboratoire de Santé Publique du 
Québec, Canada 

k. Strains C600 and WG1 from Dr. B.J. Backmann of Department of Biology, Yale 
University, USA. 
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CLAIMS : 




1. A nucleic acid molecule encoding all or part of an 
E. coli flagellin protein, provided that the nucleic acid 
molecule does not encode a protein expressed by the E. 
coli H1, H7, H12 or H48 type strains. 

2. A nucleic acid molecule according to claim 1 wherein 
the molecule is derived from a fliC gene. 

3. A nucleic acid molecule including all or part of a 
sequence according to any one of SEQ ID NOs: 1 to 68. 

4. A nucleic acid molecule consisting of all or part of 
a sequence according to any one of SEQ ID NOs: 1 to 68. 

5. A nucleic acid molecule according to any one of 
claims 1-4 wherein the molecule is from about 10 to 20 
nucleotides in length. 

6. A nucleic acid molecule according to claim 5 wherein 
the molecule is capable of hybridising to the central 
region of a flagellin gene from which the molecule is 
derived. 

7. A nucleic acid molecule selected from the group of 
nucleic acid molecules shown in Table 3. 

8. A method of detecting the presence of E. coli of a 
particular H serotype in a sample, the method comprising 
the step of specifically hybridising at least one nucleic 
acid molecule derived from a flagellin gene, wherein the 
at least one nucleic acid molecule is specific for a 
particular flagellin gene associated with the H serotype, 
to any E. coli in the sample which contain the gene, and 
detecting any specifically hybridised nucleic acid 
molecules, wherein the presence of specifically 
hybridised nucleic acid molecules identifies the presence 
of the H serotype in the sample. 

9. A method according to claim 8 wherein the at least 
one nucleic acid molecule is according to any one of 
claims 1 to 7. 

10. A method according to claim 8 wherein the 
specifically hybridised nucleic acid molecules are 
detected by Southern Blot analysis. 

11. A method of detecting the presence of E. coli of a 
particular H serotype in a sample, the method comprising 
the step of specifically hybridising at least one pair of 
nucleic acid molecules to any E. coli in the sample which 
contains the flagellin gene for the particular H 
serotype, wherein at least one of the nucleic acid 
molecules is specific for the particular flagellin gene 
associated with the H serotype, and detecting any 
specifically hybridised nucleic acid molecules, wherein 
the presence of specifically hybridised nucleic acid 
molecules identifies the presence of the H serotype in 
the sample. 

12. A method according to claim 11 wherein the at least 
one pair of nucleic acid molecules is according to any 
one of claims 1 to 7. 

13. A method according to claim 11 wherein the 
specifically hybridised nucleic acid molecules are 
detected by the polymerase chain reaction. 

14. A method for detecting the presence of a particular 
O serotype and H serotype of E. coli in a sample, the 
method comprising the following steps: 

(a) specifically hybridising at least one nucleic 
acid molecule derived from and specific for a gene 
encoding a transferase or a gene encoding an enzyme for 
the transport or processing of a polysaccharide or 
oligosaccharide unit, the gene being involved in the 
synthesis of a particular E. coli O antigen, to any E. 
coli in the sample which contain the gene; 

(b) specifically hybridising at least one nucleic 
acid molecule derived from and specific for a particular 
flagellin gene associated with that H serotype, to any E. 
coli in the sample which contain the gene; and 

(c) detecting any specifically hybridised nucleic 
acid molecules, wherein the presence of specifically 
hybridised nucleic acid molecules identifies the presence 
of the particular H serotype and O serotype of E. coli in 
the sample. 

15. A method according to claim 14 wherein the at least 
one nucleic acid molecule of step (a) is selected from 
the group consisting of: 

wbdH (nucleotide position 739 to 1932 of Figure 5), 
wzx (nucleotide position 8646 to 9911 of Figure 5), 
wzy (nucleotide position 9901 to 10953 of Figure 5), 
wbdM (nucleotide position 11821 to 12945 of Figure 5), 
wbdN (nucleotide position 79 to 861 of Figure 6), 
wbdO (nucleotide position 2011 to 2757 of Figure 6), 
wbdP (nucleotide position 5257 to 6471 of Figure 6), 
wbdR (nucleotide position 13156 to 13821 of Figure 6), 
wzx (nucleotide position 2744 to 4135 of Figure 6) and 
wzy (nucleotide position 858 to 2042 of Figure 6). 

16. A method according to claim 14 wherein the at least 
one nucleic acid molecule of step (a) is selected from 
the group of nucleic acid molecules shown in Tables 8, 
8A, 9 and 9A. 

17. A method according to claim 14 wherein the at least 
one nucleic acid molecule of step (b) is according to any 
one of claims 1 to 7. 

18. A method according to claim 14 wherein the 
specifically hybridised nucleic acid molecules are 
detected by Southern Blot analysis. 

19. A method for detecting the presence of a particular 
O serotype and H serotype of E. coli in a sample, the 
method comprising the following steps: 

(a) specifically hybridising at least one pair of 
nucleic acid molecules derived from and specific for a 
gene encoding a transferase or a gene encoding an enzyme 
for the transport or processing of a polysaccharide or 
oligosaccharide unit, the gene being involved in the 
synthesis of a particular E. coli O antigen, to any E. 
coli in the sample which contain the gene; 

(b) specifically hybridising at least one pair of 
nucleic acid molecules derived from and specific for a 
particular flagellin gene associated with that H 
serotype, to any E. coli in the sample which contain the 
gene; and 

(c) detecting any specifically hybridised nucleic 
acid molecules, wherein the presence of specifically 
hybridised nucleic acid molecules identifies the presence 
of the particular H serotype and O serotype of E. coli in 
the sample. 

20. A method according to claim 19 wherein the at least 
one pair of nucleic acid molecules of step (a) is selected 
from the group consisting of: 

wbdH (nucleotide position 739 to 1932 of Figure 5), 
wzx (nucleotide position 8646 to 9911 of Figure 5), 
wzy (nucleotide position 9901 to 10953 of Figure 5), 
wbdM (nucleotide position 11821 to 12945 of Figure 5), 
wbdN (nucleotide position 79 to 861 of Figure 6), 
wbdO (nucleotide position 2011 to 2757 of Figure 6), 
wbdP (nucleotide position 5257 to 6471 of Figure 6), 
wbdR (nucleotide position 13156 to 13821 of Figure 6), 
wzx (nucleotide position 2744 to 4135 of Figure 6) and 
wzy (nucleotide position 858 to 2042 of Figure 6). 

21. A method according to claim 19 wherein the at least 
one pair of nucleic acid molecules of step (a) is 
selected from the group of nucleic acid molecules shown 
in Tables 8, 8A, 9 and 9A. 

22. A method according to claim 19 wherein the at least 
one nucleic acid molecule of step (b) is according to any 
one of claims 1 to 7. 

23. A method according to claim 19 wherein the 
specifically hybridised nucleic acid molecules are 
detected by the polymerase chain reaction. 

24. A method for detecting the presence of a particular 
O serotype and H serotype of E. coli in a sample, the 
method comprising the following steps: 

(a) specifically hybridising at least one nucleic 
acid molecule derived from and specific for a gene 
encoding a flagellin associated with a particular E. coli 
H antigen serotype, to any E. coli carrying the gene and 
present in the sample; 

and 

(b) detecting the at least one specifically 
hybridised nucleic acid molecule, wherein the at least one 
nucleic acid molecule is specific for the particular 
combination of O and H antigen. 

25. A method according to claim 24 wherein the at least 
one nucleic acid molecule is according to any one of SEQ ID 
NOs: 9, 55, 57 to 65. 

26. A method for testing a food derived sample for the 
presence of one or more particular E. coli O antigens and 
H antigens, wherein the particular E. coli O and H 
antigens in the food derived sample are detected using the 
method of any one of claims 8, 11, 14 or 19. 

27. A method for testing a faecal derived sample for the 
presence of one or more particular E. coli O antigens and 
H antigens wherein the particular E. coli O and H antigens 
in the faecal derived sample are detected using the method 
of any one of claims 8, 11, 14 or 19. 

28. A method for testing a patient or animal derived 
sample for the presence of one or more particular E. coli 
O antigens and H antigens wherein the particular E. coli O 
and H antigens in the patient or animal derived sample are 
detected using the method of any one of claims 8, 11, 14 
or 19. 

29. A kit for identifying the H serotype of E. coli, the 
kit comprising at least one nucleic acid molecule 
according to any one of claims 1 to 7. 

30. A kit for identifying the H and O serotype of E. 
coli, the kit comprising: 

(a) at least one nucleic acid molecule derived from and 
specific for an E. coli flagellin gene; and 

(b) at least one nucleic acid molecule derived from and 
specific for a gene encoding a transferase or a gene 
encoding an enzyme for the transport or processing of a 
polysaccharide or oligosaccharide unit, the gene being 
involved in the synthesis of a particular E. coli O 
antigen. 



31. A kit according to claim 30 wherein the at least one 
nucleic acid molecule of (a) is selected from the group 
consisting of: 
wbdH (nucleotide position 739 to 1932 of Figure 5), 
wzx (nucleotide position 8646 to 9911 of Figure 5), 
wzy (nucleotide position 9901 to 10953 of Figure 5), 
wbdM (nucleotide position 11821 to 12945 of Figure 5), 
wbdN (nucleotide position 79 to 861 of Figure 6), 
wbdO (nucleotide position 2011 to 2757 of Figure 6), 
wbdP (nucleotide position 5257 to 6471 of Figure 6), 
wbdR (nucleotide position 13156 to 13821 of Figure 6), 
wzx (nucleotide position 2744 to 4135 of Figure 6) and 
wzy (nucleotide position 858 to 2042 of Figure 6). 





32. A kit according to claim 30 wherein the at least one 
nucleic acid molecule of (a) is selected from the group 
of nucleic acid molecules shown in Tables 8, 8A, 9 and 
9A. 

33. A kit according to claim 30 wherein the at least one 
nucleic acid molecule of (b) is according to any one of 
claims 1 to 7. 
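
The nucleotide positions recited for the O antigen genes in claim 31 are 1-based and inclusive, so each range can be checked and extracted mechanically. The following Python sketch is illustrative only: the helper names are hypothetical, and the gene names and figure numbers assume a reading of claim 31 in which the fourth and fifth entries are wbdM (Figure 5) and wbdN (Figure 6). It computes the length of each range, reports whether that length is a whole number of codons, and shows how a range maps onto a zero-based string slice.

```python
# Gene coordinates as recited in claim 31: (gene, figure, start, end),
# with start and end 1-based and inclusive. The names wbdM and wbdN and
# their figure numbers are an assumed reading of the claim text.
CLAIM_31_GENES = [
    ("wbdH", 5, 739, 1932),
    ("wzx", 5, 8646, 9911),
    ("wzy", 5, 9901, 10953),
    ("wbdM", 5, 11821, 12945),
    ("wbdN", 6, 79, 861),
    ("wbdO", 6, 2011, 2757),
    ("wbdP", 6, 5257, 6471),
    ("wbdR", 6, 13156, 13821),
    ("wzx", 6, 2744, 4135),
    ("wzy", 6, 858, 2042),
]


def range_length(start: int, end: int) -> int:
    """Number of nucleotides in a 1-based, inclusive range."""
    return end - start + 1


def extract_region(sequence: str, start: int, end: int) -> str:
    """Return the nucleotides at 1-based inclusive positions start..end."""
    if start < 1 or end > len(sequence) or start > end:
        raise ValueError("range lies outside the sequence")
    return sequence[start - 1:end]


if __name__ == "__main__":
    for gene, figure, start, end in CLAIM_31_GENES:
        length = range_length(start, end)
        whole_codons = length % 3 == 0
        print(f"{gene} (Figure {figure}): {length} nt, "
              f"whole number of codons: {whole_codons}")
```

Given the full sequence of a figure as a single string, `extract_region(figure_sequence, 739, 1932)` would return the wbdH region; `figure_sequence` is a placeholder for a cleanly transcribed copy of the figure.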
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GATCTGATGGCCGTAGGGCGCTACGTGCTTTCTGCTGATATCTGGGCTGAGTTGGAAAAA 60 

ACTGCTCCAGGTGCCTGGGGACGTATTCAACTGACTGATGCTATTGCAGAGTTGGCTAAA 120 

AAACAGTCTGTTGATGCCATGCTGATGACCGGCGACAGCTACGACTGCGGTAAGAAGATG 180 

GGCTATATGCAGGCATTCGTTAAGTATGGGCTGCGCAACCTTAAAGAAGGGGCGAAGTTC 240 

CGTAAGAGCATCAAGAAGCTACTGAGTGAGTAGAGATTTACACGTCTTTGTGACGATAAG 3 00 

CCAGAAAAAATAGCGGCAGTTAACATCC AGGCTTCTATGCTTTAAGCAATGGAATGTTAC 3 60 

TGCCGTTTTTTATGAAAAATGACCAATAATAACAAGTTAACCTACCAAGTTTAATCTGCT 420 

TTTTGTTGGATTTTTTCTTGTTTCTGGTCGCATTTGGTAAGACAATTAGCGTGAGTTTTA 480 

GAGAGTTTTGCGGGATCTCGCGGAACTGCTCACATCTTTGGCATTTAGTTAGTGCACTGG 540 

TAGCTGTTAAGCCAGGGGCGGTAGCTTGCCTAATTAATTTTTAACGTATACATTTATTCT 600 



TGCCGCTTATAGCAAATAAAGTCAATCGGATTAAACTTCTTTTCCATTAGGTAAAAGAGT 660 



GTTTGTAGTCGCTCAGGGAAATTGGTTTTGGTAGTAGTACTTTTCAAATTATCCATTTTC 72 0 



Start of orf1 

MLLCCIHINVYYLL 
CGATTTAGATGGCAGTTG ATGTTACTATGCTGCATACATATCAATGTATATTATTTACTT 7 80 



LECDMKKIVI IGNVASMMLR 
TTAGAATGTGATATGAAAAAAATAGTGATCATAGGCAATGTAGCGTCAATGATGTTAAGG 840 

FRKELIMNLVRQGDNVYCLA 
TTCAGGAAAGAATTAATCATGAATTTAGTG AGGCAAGGTGATAATGTATATTGTCTAGC A 900 

NDFSTEDLKVLS SWGVKGVK 
AATGATTTTTCCACTGAAGATCTTAAAGTACTTTCGTCATGGGGCGTTAAGGGGGTTAAA 960 

FSLNSKGXNPFKDI lAVYEL 
TTCTCTCTTAACTCAAAGGGTATTAATCCTTTTAAGGATATAATTGCTGTTTATGAACTA 102 0 

KKILKDISPDIVFSYFVKPV 
AAAAAAATTCTTAAGGATATTTCCCCAGATATTGTATTTTCATATTTTGTAAAGCCAGTA 1080 

IFGTIASKLSKVPRIVGMIE 
ATATTTGGAACTATTGCTTCAAAGTTGTCAAAAGTGCCAAGGATTGTTGGAATG ATTGAA 1140 

GLGNAFTYYKGKQTTKTKMI 
GGTCTAGGTAATGCCTTCACTTATTATAAGGGAAAGCAGACCACAAAAACTAAAATGATA 12 00 

KWIQILLYKLALPMLDDLIL 
AAGTGGATACAAATTCTTTTATATAAGTTAGCATTACCGATGCTTGATGATTTGATTCTA 12 60 

LNHDDKKDLIDQYNIKAKVT 
TTAAATC ATGATGATAAAAAAGATTTAATCGATCAGTATAATATTAAAGCTAAGGTAACA 13 20 

VLGGIGLDLNEFSYKEP PKE 
GTGTTAGGTGGGATTGGATTGGATCTTAATGAGTTTTCATATAAAGAGCCACCGAAAGAG 13 80 

KITFXFIARLLREKGIFEFI 
AAAATTACCTTTATTTTTATAGCAAGGTTATTAAGAGAGAAAGGGATATTTGAGTTTATT 1440 

EAAKFVKTTY PS SEFV I LGG 
GAAGCCGCAAAGTTCGTTAAGACAACTTATCCAAGTTCTGAATTTGTAATTTTAGGAGGT 15 0 0 
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FESNNPFSLQKNEIESLRKE 
TTTGAGAGTAATAATCCTTTCTC ATTACAAAAAAATGAAATTGAATCGCTAAGAAAAGAA 1560 

HDLIYPGHVENVQDWLEKS S 
CATGATCTTATTTATCCTGGTCATGTGG AAAATCTTCAAGATTGGTTAGAGAAAAGTTCT 162 0 

VFVLPTSYREGVPRVIQEAM 
GTTTTTGTTTTACCTACATCATATCGAGAAGGCGTACCAAGGGTGATCCAAGAAGCTATG 1680 

AIGRPVITTNVPGCRDI IND 
GCTATTGGTAGACCTGTAATAACAACTAATGTACCTGGGTGTAGGGATATAATAAATGAT 1740 

GVNGFLIPPFEINLLAEKMK 
GGGGTCAATGGCTTTTTGATACCTCCATTTGAAATTAATTTACTGGCAGAAAAAATGAAA 1800 

YFI ENKDKVLEMGLAGRKFA 
TATTTTATTGAGAATAAAGATAAAGTACTCG AAATGGGGCTTGCTGGAAGGAAGTTTGC A 1860 

EKNFDAFEKNNRLAS I IKSN 
GAAAAAAACTTTGATGCTTTTGAAAAAAATAATAGACTAGCATCAATAATAAAATCAAAT 1920 



End of orf1 

N D F * 

AATGATTTT TGACTTGAGCAGAAATTATTTATATTTCAATCTGAAAAATAAAGGCTGTTA 1980 
Start of orf2 

MNKVALITG ITGQDGSYLA 
TTA^AATAAAGTGGCATTAATTACTGGTATC ACTGGGCAAGATGGCTCCTATTTGGC AG 204 0 

ELLLEKGYEVHGIKRRASSF 
AATTATTGTTAGAAAAAGGTTATGAAGTTCATGGTATTAAACGCCGTGCATCTTCATTTA 2100 

NTERVDHIYQDSHLANPKLF 
ATACTGAGCGAGTGGATCACATCTATCAGGATTC ACATTTAGCTAATCCTAAACTTTTTC 2160 

LHYGDLTDTSNLTRILKEVQ 
TACACTATGGCGATTTGACAGATACTTCCAATCTGACCCGTATTTTAAAAGAAGTTCAAC 2220 

PDEVYNLGAMSHVAVSFESP 
CAGATGAAGTTTACAATTTGGGGGCGATGAGCCATGTAGCGGTATCATTTGAGTCACCAG 2280 

EYTADVDAIGTLRL LEAIRI 
AATACACTGCTGATGTTGATGCGATAGGAACATTGCGTCTTCTTGAAGCTATCAGGATAT 2340 

LGLEKKTKFYQASTSELYGL 
TGGGGCTGGAAAAAAAGACAAAATTTTATCAGGCTTCAACTTCAGAGCTTTATGGTTTGG 2400 

VQEI PQKETTPFYPRSPYAV 
TTCAAGAAATTCCACAAAAAGAGACTACGCCATTTTATCCACGTTCGCCTTATGCTGTTG 2 4 60 

AKLYAYWITVNYRESYGMFA 
CAAAATTATATGCCTATTGGATC ACTGTTAATTATCGTGAGTCTTATGGTATGTTTGCCT 252 0 

CNGILFNHESPRRGETFVTR 
GCAATGGTATTCTCTTTAACCACGAATCACCTCGCCGTGGCGAGACCTTTGTTACTCGTA 2580 

KITRGIANIAQGLDKCLYLG 

AAATAAC ACGCGGG ATAGC AAATATTGCTC AAGGTCTTGATAAATGCTTATACTTGGGAA 2 640 

NMDSLRDWGHAKDYVKMQWM 

ATATGG ATTCTCTGCGTGATTGGGGACATGCTAAGGATTATGTCAAAATGCAATGGATGA 2700 
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MLQQETPEDFVIATGIQYSV 
TGCTGCAGCAAGAAACTCCAGAAGATTTTGTAATTGCTACAGGAATTCAATATTCTGTCC 2760 

REFVTMAAEQVGIELAFEGE 
GTGAGTTTGTC ACAATGGCGGC AGAGCAAGTAGGC ATAGAGTTAGCATTTG AAGGTGAGG 2820 

GVNEKGVVVSVNGTDAKAVN 
GAGTAAATGAAAAAGGTGTTGTTGTTTCGGTCAATGGCACTGATGCTAAAGCTGTAAACC 2 880 

PGDVI ISVDPRYFRPAEVET 
CGGGCGATGTAATTATATCTGTAGATCCAAGGTATTTTAGGCCTGCAGAAGTTGAAACCT 2 940 

LLGDPTNAHKKLGWSPEITL 
TGCTTGGCG ATCCTACTAATGCGC ATAAAAAATT AGGATGGAGCCCTGAAATTAC ATTGC 3000 

REMVKEMVSSDLAIAKKNVL 
GTGAAATGGTAAAAGAAATG aTTTCCAGGGATTTAGCAATAGCGAAAAAGAACGTCTTGC 3 060 

End of orf2 

LKANNIATNIPQE* 

TGAAAGCTAATAACATTGCCACTAATATTCCGCAAGAAr.\AAAAAGATAATACATTAAAT 3120 

Start of orf3 

M F 

/ ^ ATTAAAAATGGTGCTAGATTTA'TTAGTACCATTATTTTTTTTTGGGTGACTAATGTT'rA 3180 

ITSDKFREI IKLVPLVSIDL 
TTACATCAGATAAATTTAGAGAAj'iTTATCAAGTTAGTTCCATTAGTATCAATTGATCTGC 3240 

LIENENGEYLFGLRNNRPAK 
TAATTGAAAACGAGAATGGTGAA - TATTTATTTGGTCTTAGGAATAATCGACCGGCCAAAA 3 3 00 

NYFFVPGGRIRKNESIKNAF 
ATTATTTTTTTGTTCCAGGTGGTAGGATTCGCAAAAATGAATCTATTAAJ ' u ^ iATGCTTTTA 3 360 

KRISSMELGKEYGISGSVFN 
AAAGAATATCATCTATGGAATTAGGTAAAGAGTATGGTATTTCAGGAAGTGTTTTTAATG 3420 

GVWEHFYDDGFFSEGEATHY 
GTGTATGGGAACATTTCTATGATGATGGTTTTTTTTCTGAAGGCGAGGCAj>iCACATTATA 3480 

IVLCYTLKVLKSELNLPD DQ 
?AGTGCTTTGTTACACACTGAAAGTTCTTAAAAGTGAATTCAATCTCCCAGATGATCAAC 3 540 

HREYLWLTKHQINAKQDVHN 
ATCGTGAATACCTTTGGCTAACTAAACACCAAATAAATGCTAAACAAGATGTTCATAACT 3 600 

End of orf3 Start of orf4 

YSKNYFL* M 
ATTCAAAAAATTATTTTTTGiTAATTTTTATTAAAAATT/ ' ATATGCGAGAGAA'rTGT ATG T 3 660 

SQCLYPVI lAGGTGSRLWPL 
CTCAATGTCTTTACCCTGTAATTATTGCCGGAGGAACCGGAAGCCGTGTATGGCCGTTGT 3720 

SRVLYPKQFLNLVGDSTMLQ 
CTCGAGTATOATACCCTAAACAATTTTTAAATTTAGTTGGGGATTCTACAATGTTGCAAA 378 0 

TTITRLDGIECENPIVICNE 
CAACAATTACGCGTTTGGATGGCATCGAATGCGAAAATCCAATTGTTATCTGCAATGAAG 3 840 

DHRFIVAEQLRQIGKLTKNI 
ATCACCGATTTATTGTAGCAGAOCAATTACGACAGATTG G TAAG C TAACCAAGAATATTA 3 900 

ILEPKGRNTAP AIALAAFIA 
T ACTTGAGCCGAAAGGCCGTAATACTGCACCTGCCATAGCOTTAGCTGCTTTTATCGCTC 3 9 60 
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QKNNPNDDPLLLVLAADHSI 
AGAAGAAT/iATCCTAATOACGACCCTTTATTATTAGTACTTGCGGCAGACCACTCTATAA 4020 



NNEKAFRESI IKAMPYATSG 
ATAATGAAAAAGCATTTCGAGAGTCAATAAT/ ^ iA/LAGCTATGCCGTATGCA/LCTTCTGGGA 4 080 

KLVTFGI IPDTANTGYGYIK 
AGTTAGTAACATTTGGAATTATTCCGGACACGGCAAATACTGGTTATGGATATATTAAGA 414 0 

RSSSADPNKEFPAYNVAEFV 
C/AGTTCTTCAGCTGATCCTAATAAAGAATTCCCAGCATATAATGTTCCGGAGTTTGTAG 4200 

EKPDVKTAQEYI S SGNYYWN 
A/LAAACCAGATGTTA?iAACAGCACAGGAATATATTTCGAGTGGGAATTATTACTGGAATA 4260 

SGMFLFRASKYLDELRKFRP 
GCGGAATGTTTTTATTTCGCGCCAGT/iAATATCTTGATGAACTACGGA/ATTTAGACCAG 43 20 

DIYHSCECATATANIDMDFV 
ATATTTATCATAGCTGTGAATGTGCAACCGCTACAGCAAATATAGATATGGACTTTGTCC 4380 

RINEAEFINCPE ESIDYAVM 
GAATTAACGAGGCTGAGTTTATTAATTGTCCTGAAGAGTCTATCGATTATGCTGTGATGG 4440 

EKTKDAVVLPIDIGWNDVGS 
AAAAAACAAAAGACCCTGTAGTTCTTCCGATAGATATTCCCTGGAATGACGTGGGTTCT'P - 4500 

WSSLWDISQKDCHGNVCHGD 
GGTCATCACTTTGGGATATAAGCCAT^GGATTGCCATGGTAATGTGTGCCATGGGGATG 4560 

VLNHDGENSFIYSES SLVAT 
TGCTGAATCATGATGGAGAAAATAGTTTTATTTACTCTGAGTCAPiGTCTCGTTGCGACAG 4620 

VGVSNLVIVQTKDAVLVADR 
TCGGAGTAAGTAATTTAGTAATTGTCCAAACCAAGGATGCTGTACTGGTTGCGGACCGTG 4680 

DKVQNVKNIVDDLKKRKRAE 
ATAAAGTCCAAAATGTTT V iAAAACATAGTTGACGATCTAAAAAAGAGAAA/iCGTGCTGAAT 4740 

YYMHRAVFRPWGKFDAIDQG 
ACTACATGCATCGTGCAGTTTTTCGCCCTTGGGGT/AATTCGATCCAATACACCAAGGCG 4800 

DRYRVKKIIVKPGEGLDLRM 
ATAGATATAGAGTAAAAAAAATAATAGTTAAACCAGGAGAAGGGTTAGATTTAAGGATGC 4860 

HHHRAEHWIVVSGTAKVSLG 
ATCATCATACGGCAGAGCATTGGATTGTTGTATCCGGTACTGCTAA^iGTTTCACTAGGTA 4920 

SEVKLLVSNESIYIPQGAKY 
CTGAAGTTAAACTATTAGTTTCTAATGAGTCTATATATATCCCTCAGGGAGCAAAATATA 49 80 

SLENPGVIPLHLIEVS SGDY 
GTCTTGAGAATCCAGGCGTAATACCTTTGCATCTAATTGAAGTAAGTTCTGGTGATTACG 5040 

LESDDIVRFTDRYNSKQFLK 
TTGi ^ iATCAGATGATATAGTGCGTTTTACTGACAGATATAACAGTAAACAT ^ iTTCCTAAAGC 5100 



End of orf4 Start of orf5 

MNKITCFKAYDIRGRL 

R D * 

G AGATTG^^TAAATATGAATAAAATAACTTGCTTCAAAGCATATGATATACGTCGGCGTCT 
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GAELNDEIAYRIGRAYGEF F 
TGGTGCTGAATTGAATGATGAAATAGCATATAGAATTGGTCGCGCTTATGGTGAGTTTTT 522 0 

KPQTVVVGGDARLTSESLKK 
TAAACCTCAAACTGTAGTTGTGGGAGGAGATGCTCGCTTAACAAGTGAGAGTTTAAAGAA 5280 

SL-*SNGLCDAGVNVLDLGMCG 
ATCACTCTCAAATGGGCTATGTGATGCAGGCGTAAATGTCTTAGATCTTGGAATGTGTGG 53 4 0 

TEEIYFSTWYLGIDGGIEVT 
TACTGAAGAGATATATTTTTCCACTTGGTATTTAGGAATTGATGGTGGAATCGAGGTAAC 54 00 

ASHNPIDYNGMKLVTKGARP 
TGCAAGCCATAATCCAATTGATTATAATGGAATGAAATTAGTAACCAAAGGTGCTCGAGC 5460 

ISSDTGLKDIQQLVESNNFE 
AATCAGCAGTGACACAGGTCTCAAAGATATACAACAA'FPAGTAGAGAGTAATAATTTTGA 5520 

ELNLEKKGNITKYSTRDAYI 
AGAGCTCAACCTAGAAAAAAAAGGGAATATTACCAAATATTCCACCCGAGATGCCTAGAT 5580 

NHLMGYANLQKIKKIKIVVN 
AAATCATTTGATGGGCTATGCTAATCTGCAAAAAATAAAAAAAA'TCAAAATAG'rTGTGAA 5640 

SGNGAAGPVIDAIEECFLRN 
TTCTGGGAATGGTGCAGCTGGTCCTGTTATTGATGCTATTGAGGAATGCTTTTTACGGAA 5700 

NIPIQFVKINNTPDGNFPHG 
CAATATTCCGATTCAGTTTGTAAAAATAAATAATACACCCGATGGTAATTTTCCACATGG 5760 

IPNPLLPECREDTSSAVIRH 
TATCCCTAATCCATTACTACCTGAGTGCAGAGAAGATACCAGCAGTGCGGTTATAAGACA 5820 

SADFGIAFDGDFDRCF FFDE 
TAGTGCTGATTTTGGTATTGGATTTGATGGTGATTTTGATAGGTGTTTTTTCTTTGATGA 5880 

NGQFIEGYYIVGLLAEVFLG 
AAATGGACAATTTATTGAAGGATACTACATTGTTGGTTTATTAGCGGAAGTTTTTTTAGG 5940 

KYPNAKI IHDPRLIWNTIDI 
GAAATATCCAAACGCAAAAATCATTCATGATCCTCGCCTTATATGGAATACTATTGATAT 6000 

VESHGGI PIMTKTGHAYIKQ 
CGTAGAAAGTCATCGTGGTATACCTATAATGACTAAAACCGGTCATGCTTACAT'I'AAGCA 6060 

RMREEDAVYGGEMSAHHYFK 
AAGAATGCGTGAAGAGGATGCCGTATATGGCGGCGAAATGAGTGCGGATCATTATTTTAA 6120 

DFAYCDSGMI PWILICEL LS 
AGATTTTGCATACTGCGATAGTGGAATGATTCCTTGGATTTTAATTTGTGAACTTTTGAG 6180 

LTNKKLGELVCGCINDWPAS 
TCTGACAAATAAAAAATTAGGTGAACTGGTTTGTGGTTGTATAAACGACTGGCCGGCAAG 6240 

GEINCTLDNPQNEIDKLFNR 
TGGAGAAATAAACTGTACACTAGACAATCCGCAAAATGAAATAGATAAATTATTTAATCG 63 00 

YKDSALAVDYTDGLTMEFSD 
TTACAAAGATAGTGCCTTAGCTGTTGATTACACTGATGGATTAACTATGGAGTTCTCTGA 63 60 

WRFNVRCSNTEPVVRLNVES 
TTGGCGTTTTAATGTTAGATGCTCAAATACAGAACCTGTAGTACGATTGAATGTAGAATC 64 2 0 

RNNAILMQEKTEEILNFISK 
TAGGAATAATGCTATTCTTATOCAGGAAAAAACAGAAGAAATTCTGAATTTTATAT C AA^ 64 80 
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End of orfS Start of orf6 

MKVLLTG 
ATAA^TTTGCACCTGAGTTCATAATGGGAA C AA G AAATATATGAAAGTACTTCTGACTGG 6540 

STGMVGKNILEHD-:SASKYNI 
CTCAACTG G CATGGTTGGTAAGAATATATTAGAGCATGATAGTGCAAGTA/i/LTATAATAT 6600 

LTPTS SDLNLLDKNEIEKFM 
ACTTACTCCi ^ i/iCCAGCTCTGATTTGAATTTATTAGATAAAAATGAAATAGAAAAAOTCAT 6660 

LINMPDCI IHAAGDVGGIHA 
GCTTATCAACATGCCAGACTGTATTATACATGCAGCGGGATTAGTTGGAGGCATTCATGC 6720 

NISRPFDFLEKNLQMGLNLV 
AAATATAAGCAGGCCGTTTGATTTTCTGGAAAAAAATTTGCAGATGGGTTTAAATTTAGT 67 80 

SVAKKLGIKKVLNLGSSCMY 
TTCCGTCGCAAAAAAACTAGGTATCAAGAAAGTGCTTAACTTGGGTAGTTCATGCATGTA 6840 

PKNFEEAIPEKALLTGELEE 
CCCCAAAAACTTTGAAGAGGCTATTCCTGAGAAAGCTCTGTTAACTOGTGAGCTAGAAGA 6900 

TNEGYAIAKIAVAKACEYIS 
AACTAATCAGGGATATGCTATTGCGAAAATTGCTGTAGCAAAAGCATGCGAATATATATC 6960 

RENSNYFYKTI IPCNLYGKY 
AAGAGAAAACTCTAATTAT'TTTTATAAAACAATTATCCCATG'rAATa'a'ATATGGGAAATA 7020 

DKFDDNSSHMIPAVIKKIHH 
TGATAAATTTGATGATAACTCGTCACATATGATTCCGGCAGTTATAAAAAAAATCCATCA 7 080 

AKINNVPEIEIWGDGNSRRE 
TGCGAAAATTAATAATGTCCCAGAGATCGAAATTTGGGGGGATGGTAATTCGCGCCGTGA 7140 

FMYAEDLADLIFYVIPKIEF 
GTTTATGTATGCAGAAGATTTAGCTGATCTTATTTTTTATGTTATTCCTAAAATAGAATT 7200 

MPNMVNAGLGYDYS INDY YK 
CATGCCT/iATATGGTAAATGCTGGTTT'AGGTTACGATTATTCAATTAATGACTATTATAA 7 2 60 

IIAEEIGYTGSFSHDLTKPT 
GATAATTGCAGAAGAAATTGGTTATACTGGGAGTTTTTCTCATGATTTAACi " iAAACCAAC 7 3 20 

GMKRKLVDISLLNKIGWSSH 
AGGAATGAT ' iACGGAAGCTAGTAGATATTTCATTGCTTAATAAAATTGGTOGGTCAAGTCA 73 80 

FELRDGIRKTYNYYLENQNK 
CTTTGAACTCAGAGATGGCATCAGAAAGACCTATAATTATTACTTGGAGAATCAAAATAA 7440 



Start of orf7. End of orf6 

MITYPLASNTWDEYEYAAIQ 
* 

ArGATTACATACCCACTTGCTAG'TAATACTTGGGATGAATATOAGTATGCAGCAATAGAG 7 500 

SVIDSKMFTMGK KVELYEKN 
TCAGTAATTGACTCAAAAATGTTTACCATGGGTAAAAAGGTTGAGTTATATGAGAAAAAT 7 560 

FADLFGSKYAVMVS SGSTAN 
TTTGCTGATTTGTTTGGTAGCAAATATGCCGTAATGGTTAGCTCTGGTTCTA C AGCTAAT 7 62 0 
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LLMIAALFFTNKPKLKRGDE 
CTGTTAATGATOGCTGCCCTTTTCTTCACTAATAAACCiWu^CTTWu T> .CACGTGATGAA 7680 

IIVPAVSWSTTYYPLQQYGL 
ATAATAGTACCTCCAGTGTCATGGTCTACGACATATTACCCTCTGCJ ^ ACAGTATGGCTTA 7740 

KVKFVDINKETLNIDIDS LK 
i ^ u^CGTGAAGTTTGTCGATATC/ ' LATAAAGAAACTTTAAATATTGATATCOATAGTTTGAAA 7800 

NAISDKTKAILTVNLLGNPN 
i ^ u\TGCTATTTCAGATAA/i/iCAAiy > LGCAATATTGACAGTAAAT'rTATTAGGTAATCCTAAT 7 860 

DFAKINEI INNRDIILLEDN 
GATTTTGCJ ^ iAAAATAAATGAGATAATAAATAATAGGGATATTATCTTACTAGAAGATAAC 7 920 

CESMGAVFQNKQAGTFGVMG 
TGTGAGTCGATGGGCGCGGTCTTTCAi a iAATAi\GCAGGCAGGCACATTCGGAGTTATCGGT 7980 

TFSSFYSHHIATMEGGC VVT 
ACCTTTAGTTCTTTTTACTCTCATCATATAGCTACj'u\TGGi T> u\GGGCGCTGCGTAGTTACT 8040 



DDEELYHVLLCLRAHGWTRN 
GATGATGAAGAGCTGTATCATGTATTGTTGTGCCTTCGAGCTCATGGTTGGACJJ^GAAAT 8100 

LPKENMVTGTKSDDIFEESF 
TTACCAAA/iGAGAATATGOTTACAGGCACT/ii\GAGTGATGATATTTTCGAAGAGTCGTTT 8160 

KFVLPGYNVRPLEMSGAIGI 
?AGTTTGTTTTACCAGGATAGi T^ u\TGTTCGCCCACTTCiWiTGAGTGGTGCTATTGCG^^^ 8220 

EQLKKLPGFISTRRSNAQYF 
GACCAACTTJu^uXAAGTTACCAGG'TTTTATATCCACCAGACGTTCCAATGCAeAATATTTT 82 80 

VD KFKDHPFLDIQKEVGES S 
GTAGATAAATTTAAAGATCATCCATTCCTTGATATACAAAAAGAAGTTOGTGiWiGTAGC 83 40 

W FGFSFVIKEGAAIERKSLV 
TGGTTTGGTTTTTCCTTCGTTx\TJiAAGGAGGOAGCTGCTATTGAGAGGAi\GAGTTTAGTA 8400 

NNLISAGIECRPIVTGNFLK 
AATAATCTGATCTCAGCAGGCATTGAATGCCGACC/uATTGTTACTGGGAATTTTCTCAAA 8460 

NERVLSYFDYSVHDTVANAE 
J ^ ATGjntACGTCTTTTGAGTTATTTTGATTACTCTQTACATOATACGGTAGCiWiTGC 8520 

YIDKNGFFVGNHQIPLFNEI 
TATATAGAT^u^GAATGGTTTTTTTGTCGGAAACCACCAGATACCTTTGTTTAATGAAATA 8580 

End of orf7 

DYLRKVLK* 

GATTATCmCGAAAAGTATT/iAAAri^i^CTAACGAGGCACTCTATTOCGAATAGAGTGCCT 8 640 
Start of orf8 

MVLTVKKILAFGYSKVLP 

TTAAGATC G TATTAACAGTGAAAAAAATTTTAGCGTTTGGCTATTCTAAi^GTACTACCAC 87 00 

PVIEQFVNPICIFI ITPLIL 
CGGTTATTGJ ^ J > ,cAGTTTGTCAATCC/u\TTTGCATCTTCATTATCACACCACTAATACTCA 87 60 

NFIL GKQSYGNWILLITIVSF 
ACCACC TGGGTAAGCAAAGCTATGGTAATTGGATTTTATTAATTACTATTGTATCTTTTT 8 82 0 
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SQLICGGCSA WIAKI lAEQR 
CTCAGTTAi\TATGTOGAGGATGTTCCGCATGGATTGCAAAAATCATTGCAGi ^ u " iCAGAGAA 8880 



ILSDLSKKNALRQISYNFSI 
TTCTTAGTGATTTATCAAAAAAAAATGCTTTACCTCAAATTTCCTATAATTTTTCAATTG 8940 

VIIAFAVLISFLILSICFFD 
TTATTATCGCATTTGCGGTATTGATTTCTTTTCTTATATTAAGTATTTGTTTCTTCGATG 9000 

VARNNSSFLFAI IICGFFQE 
TTCCGAGGAATAATTCTTCATTCTTATTCGCGATTATTATTTGTGG'FTTTTTTCAGGAAG 9 060 

VDNLFSGALKGF EKFNVSCF 
TTGATAATTTA^rpfpaaqV^naiOrnnrm^ 7^ p^ , ;n , qrifpfTirpfp^ , ^ , ^ , ^ ^ ^^^^r^rj^ r j^ . j ^^r^^ ^^^r^^j ^ ^^^^^ ^^^ ^^2^ 

FEVITRVLWAS IVIYGIYGN 
TTGAAGTAAOTACAAGAGTGCTCTGGGCTTCTATAGTT ^ J ' La'ATATGGCATTTACGGAAATG 9180 

ALLYFTCLAFTIKGMLKYIL 
CACTCTTATAT?TTTACATGTTTAGCCTTTACCATTAi » iAGGTATGCTAAAATATATTCTTG 9240 

VCLNITGCFINPNFNRVGIV 
TATGTCT0AATATTACCOGTTGTTTCATC?u\TCCTAi\TTTTAATAGACT'rGGGATTGTTA 9300 

NLLNESKWMFLQLTGGVSLS 
ATTTGTTAAJiTGAGTCAAAATGGATGTTTCTTCAATTAACTGGTGGCGTCTCACTTAGTT 93 60 

LFDRLVIPLILSVSKLASYV 
TCTTTGATAGGCTCGTAATACCATTGATTTTATCTGTCAGTAAACTGGCTTCTTATGTCC 9420 

PCLQLAQLMFTLSASANQIL 
CTTGCCTTCJu\CTAGCTCJ ' iJiTTGATGTTCACTCTTTCTGCGTCTGCAAATCAAATATTAC 9480 

LPMFARMKASNTFPSNCF FK 
a■ACCAATGTTTGCTAGAi^TCiWiGCATCTAACACATT'^CCCTCT/u^TTOTTTTTTTAAAA 9540 

ILLVSLISVLPCLALFFFGR 
TTCTGCTTGTATCACTAATTTCTGTTTTGCCTTGTCTTGCGTTATTCTTTTTTGGTCGTG 9600 

^^LSIWINPTFATENYKLMO 
ATATATTATCiWATGaATiWiCCCTACATTTGCAACTQAAAATTATiWiOT^ ^ 9660 

ILAISYILLSMMTSFHFLLL 
TTTTAGCTATA^AHTrr AP ArpfTirprp a rpmnmn . t^ , , 11 , rpg p^ , rpr^ fn, o r^ i TP TTTT C7 i T TTCTTGTTATTA G 9720 

GIGKSKLVANLNLVAGLALA 
GJ > u\TTGGTAi^ATCTAAGCTTGTTGCAAATTTAAATCTGGTTGCAGGGCTCGCACTTGCTG 9780 

ASTLIAAHYGLYAISMVKI I 
CTTCAACGTTAi\TCaCAGCTCATTATGGCCTTTATGCiWATCTATGGTi ^ u ^ u^^^^ 9840 

YPAFQFYYLYVAFVYFNRAK 
ATCCGGCTTTTCiWTTTAOTACCTTTATGTAGCTTTTGTCTATTTTiWAGAGCGAAAA 9900 



Start of orf9. End of orf8 

MSIDLLFSITEIAIVFSCTI 
N V Y * 

MlGTCTATm^TTTACTTTTTTCAATTACTGAAATCGCJu\TTGTTTTTTCTTCCACTATT 9960 

YIFTQCLLMRRIYLDKSILI 
TACATATTTACTCi\ATGTTTG TTAATGCGGAGGATCTATTTAGATAAAAGTATTTTAATT 1002 0 

LLCLLFFLVI IQLPELNVNG 
CTTTTATGCTTGCTCTTTTTTTTAGTAATC ATTC AACTTCCTGAGCTTAATGTAAACGGT 1008 0 
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LVDSLKLSLPLLMVFIAFQK 
TTGGTCGATTCTTTAAAGTTATCACTGCCTTTATTGATGGTCTTTATCGCTTTTCAAAAA 10140 

PKLCLWVI IALLFLNSAFNF 
CCGAAATTATGCTTQTGGGTTATTATTGCATTGTTGTTTTTGAACTCTGCATTTAATTTT 10200 

LYLKTFDKFS SFPFTFFILL 
TTATATTTAAAGACATTCGATAAGTTTAGCTCATTTCCTTTTACTTTTTTTATATTGCTG 10260 

FYLFRLGIGNLPVYKNKKFY 
TTTTACTTGTTTAGATTGGGAATTGGTAATTTACCGGTTTATAAAAATAAAAAATTTTAC 10320 

ALIFLFILIDIMQSLLINYR 
GCGTTGATTTTTCTCTTTATATTAATAGACATAATGCAGTCATTGTTAATAAATTATAGG 10380 

GQILYSVICILILVFKVNLR 
GGGCAGATTTTATATTCCGTAATTTGCATCCTGATACTTGTGTTTAAAGTTAATTTAAGA 10440 

KKIPYFFLMLPVLYVI IMAY 
AAAAAGATTCCATACTTTTTTTTAATGCTGCCAGTTTTATATGTAATTATTATGGCTTAT 10500 

IGFNYFNKGVTF FEPTASNI 
ATTGGTTTTAATTATTTCAATAAAGGCGTAACTTTTTTTGAACCTACAGCAAGTAATATT 10560 

ERTGMIYYLVSQLGDYIFHG 
GAACGTACGGGGATGATATATTATTTGGTTTCACAGCTTGGTGATTATATATTCCATGGT 10620 

MGTLNFLNNGGQYKTLYGLP 
ATGGGGACATTAAATTTCTTAAATAACGGCGGACAATATAAGACGTTATATGGACTTCCA 10680 

SLIPNDPHDFLLRFFISIGV 
TCATTAATTCCTAATGACCCTCATGATTTTTTATTACGGTTCTTTATAAGTATTGGTGTG 10740 

IGALVYHSIFFVFFRRISFL 
ATAGGAGCATTGGTTTATCATTCTATATTTTTTGTTTTTTTTAGGAGAATATCTTTCTTA 10800 

LYERNAPFIVVSCLLLLQVV 
TTATATGAGAGAAATGCTCCTTTCATTGTTGTAAGTTGTTTGTTACTGTTACAAGTTGTG 10860 

LIYTLNPFDAFNRLICGLTV 
TTAATTTATACATTAAACCCTTTTGATGCTTTTAATCGATTGATTTGCGGGCTTACAGTT 10920 



Start of orf10 End of orf9 

GVVYGFAKIR* 

GGAGTTGTTTATGGATTTGCAAAAATTAGATAAGTATACCTGTAATGGAAATTTAGACGC 10980 

PLVSI IIATYNSELDIA KCL 
TCCACTTGTTTCAATAATCATTGCAACTTATAATTCTGAACTTGATATAGCTAAGTGTTT 11040 

QSVTNQSYKNIEI IIMDGGS 
GCAATCGGTAACTAATCAATCTTATAAGAATATTGAAATCATAATAATGGATGGAGGATC 11100 

SDKTLDIAKSFKDDRIKIVS 
TTCTGATAAAACGCTTGATATTGCAAAATCGTTTAAAGACGACCGAATAAAAATAGTTTC 11160 

EKDRGIYDAWNKAVDLSIGD 
AGAGAAAGATCGTGGAATTTATGATGCCTGGAATAAAGCAGTTGATTTATCCATTGGTGA 11220 

WVAFIGSDDVYYHTDAIASL 
TTGGGTAGCATTTATTGGTTCAGATGATGTTTACTATCATACAGATGCAATTGCTTCATT 11280 

MKGVMVSNGAPVVYGRTAHE 
GATGAAGGGGGTTATGGTATCTAATGGCGCCCCTGTGGTTTATGGGAGGACAGCGCACGA 11340 
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GPDRNISGFSGSEWYNLTGF 
AGGTCCCGATAGGAACATATCTGGATTTTCAGGCAGTGAATGGTACAACCTAACAGGATT 

KFNYYKCNLPLPIMSAIYSR 
TAAGTTTAATTATTACAAATGTAATTTACCATTGCCCATTATGAGC^CAATATATTCTCG 

DFFRNERFDIKLKIVADADW 
TGATTTCTTCAGAAACGAACGTTTTGATATTAAATTAAAAATTGTTGCTGACGCTGATTG 

FLRCFIKWSKEKSPYFINDT 
GTTTCTGAGATGTTTCATCAAATGGAGTAAAGAGAAGTCACCTTATTTTATTAATGACAC 

TPIVRMGYGGVSTDIS SQVK 
GACCCCTATTGTTAGAATGGGATATGGTGGGGTTTCGACTGATATTTCTTCTCAAGTTAA 

TTLESFIVRKKNNISCLNIQ 
AACTACGCTAGAAAGTTTCATTGTACGCAAAAAGAATAATATATCCTGTTTAAACATACA 

LILRYAKILVMVAI KNIFGN 
GCTGATTCTTAGATATGCTAAAATTCTGGTGATGGTAGCGATCAAAAATATTTTTGGCAA 

NVYKLMHNGYHSLKKIKNKI 
TAATGTTTATAAATTAATGCATAACGGGTATCATTCCCTAAAGAAAATCAAGAATAAAAT 



Start of orf11. End of orf10 

MKIVYI ITGLTCGGAEHLMT 
* 

ATGAAGATTGTTTATATAATAACCGGGCTTACTTGTGGTGGAGCCGAACACCTTATGACG 11880 

QLADQMFIRGHDVNI ICLTG 
CAGTTAGCAGACCAAATGTTTATACGCGGGCATGATGTTAATATTATTTGTCTAACTGGT 11940 

I SEVKPTQNI N I HYVNMDK N 
ATATCTG AGGTAAAGCCAACACAAAATATTAATATTCATTATGTTAATATGGATAAA7LAT 12000 

FRSFFRALFQVKKI I'VALKP 
TTTAGAAGCTTTTTTAGAGCTTTATTTCAAGTAAAAAAAATAATTGTCGCCTTAAAGCC A 12060 

DIIHSHMFHANIFSRFIRML 
GATATAATACATAGTCATATGTTTCATGCTAATATTTTTAGTCGTTTTATTAGGATGCTG 12120 

IPAVPLICTAHNKNEGGNAR 
ATTCCAGCGGTGCCCCTGATATGTACCGCACACAACAAAAATGAAGGTGGCAATGCAAGG 12180 

MFCYRLSDFLAS IT TNVSKE 
ATGTTTTGTTATCG ACTG AGTGATTTTTTAGCTTCTATTACTAC AAATGTAAGTAAAGAG 12240 

AVQEFIARKATPKNKIVEIP 
GCTGTTC AAGAGTTTATAGCAAGAAAGGCTACACCTAAAAATAAAATAGTAGAGATTCCG 123 00 

NFINTNKFDFDINVRK KTRD 
AATTTTATTAATAC AAATAAATTTGATTTTGATATTAATGTCAGAAAGAAAACGCGAGAT 12360 

AFNLKDSTAVL LAVGRLVEA 
GCTTTTAATTTGAAAGACAGTACAGCAGTACTGCTCGCAGTAGGAAGACTTGTTGAAGCA 12420 

KDYPNLLNAINHLILSKTSN 
AAAGACTATCCG AACTTATTAAATGCAATAAATC ATTTGATTCTTTCAAAAACATCAAAT 12480 

CNDFILLIAGDGALRNKLLD 
TGTAATGATTTTATTTTGCTTATTGCTGGCGATGGCGC ATTAAG AAATAAATTATTGG AT 12540 

LVCQLNLVDKVF FLGQRSDI 
TTGGTTTGTC AATTG AATCTTGTGGAT AAAGTTTTCTTCTTGGGGC AAAG AAGTGATATT 12600 



11400 
11460 
11520 
11580 
11640 
11700 
11760 
11820 
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KELMCAADLFVLSSEWEGFG 
AAAGAATTAATGTGTGCTGCAGATCTTTTTGTTTTGAGTTCTGAGTGGGAAGGTTTTGGT 12660 

LVVAEAMACERPVVATDSGG 
CTCGTTGTTGCAGAAGCTATGGCGTGTGAACGTCCCGTTGTTGCTACCGATTCTGGTGGA 12720 

VKEVVGPHNDVIPVSNHILL 
GTTAAAGAAGTCGTTGGACCTCATAATGATGTTATCCCTGTCAGTAATCATATTCTGTTG 12780 

AEKIAETLKIDDNARKI IGM 
GCAGAGAAAATCGCTGAGACACTTAAAATAGATGATAACGCAAGAAAAATAATAGGTATG 12840 

KNREYIVSNFSIKTIVSEWE 
AAAAATAGAGAATATATTGTTTCCAATTTTTCAATTAAAACGATAGTGAGTGAGTGGGAG 12900 



CGCTTATATTTTAAATATTCCAAGCGTAATAATATAATTGATrGA?UVATATAAGTTTGTA 12960 
CTCTGGATGCAATAGTTTCTCTATGCTGTTTTTTTACTGGCTCCGTATTTTTACTTATAG 13020 
CTGGATTTTGTTATATATCAGTATTAATCTGTCTCAACTTCATCTAGACTACATTCAAGC 13080 



Start of gnd 

M S K Q Q I 

CGCGCATGCGTCGCGCGGTGACTACACCTGACAGGAGTATGTAATGTCCAAGCAACAGAT 13140 

GVVGMAVMGRNLALNI ESRG 
CGGCGTCGTCGGTATGGCAGTGATGGGGCGCAACCTGGCGCTCAACATCGAAAGCCGCGG 13200 

YTVSIFNRSREKTEEVVAEN 
TTATACCGTCTCCATCTTCAACCGCTCCCGCGAGAAAACTGAAGAAGTTGTTGCCGAGAA 13260 

PDKKLVPYYTVKEFVESLET 
CCCGGATAAGAAACTGGTTCCTTATTACACGGTGAAAGAGTTCGTCGAGTCTCTTGAAAC 13320 

PRRILLMVKAGAGTDAAIDS 
CCCACGTCGTATCCTGTTAATGGTAAAAGCAGGGGCGGGAACTGATGCTGCTATCGATTC 13380 

LKPYLDKGDI IIDGGNTFFQ 
CCTGAAGCCGTATCTGGATAAAGGCGACATCATTATTGATGGTGGCAACACCTTCTTCCA 13440 

DTIRRNRELSAEGFNFIGTG 
GGACACTATCCGTCGTAACCGTGAACTGTCCGCGGAAGGCTTTAACTTCATCGGTACCGG 13500 

VSGGEEGALKGPSIMPGGQK 
CGTGTCCGGCGGTGAAGAGGGCGCCCTGAAAGGCCCATCTATCATGCCAGGTGGCCAGAA 13560 

EAYELVAPILTKIAAVAEDG 
AGAAGCGTATGAGCTGGTTGCGCCTATCCTGACCAAGATTGCTGCGGTTGCTGAAGATGG 13620 

EPCITYIGADGAGHYVKMVH 
CGAACCATGTATAACTTACATCGGTGCTGACGGTGCGGGTCACTACGTGAAGATGGTGCA 13680 

NGIEYGDMQLIAEAYSL LKG 
CAACGGTATCGAATATGGCGATATGCAGCTGATTGCTGAAGCCTATTCTCTGCTTAAAGG 13740 

GLNLSNE ELATTFTEWNEGE 
CGGCCTTAATCTGTCTAACGAAGAGCTGGCAACCACTTTTACCGAGTGGAATGAAGGCGA 13800 

LSSYLIDITKDIFTKKDEEG 
GCTAAGTAGCTACCTGATTGACATCACCAAAGACATCTTCACCAAAAAAGATGAAGAGGG 13860 
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KYLVDVILDEAANKGTGKWT 
TAAATACCTGGTTGATGTGATCCTGGACGAAGCTGCGAACAAAGGCACCGGTAAATGGAC 13920 

SQSSLDLGEPLSLITESVFA 
CAGCCAGAGCTCTCTGGATCTGGGTGAACCGCTGTCGCTGATCACCGAATCCGTATTCGC 13980 

RYISSLKDQRIAASKVLSGP 
TCGCTACATCTCTTCTCTGAAAGACCAGCGCATTGCGGCATCTAAAGTGCTGTCTGGTCC 14040 

QAKLAGDKAEFVEKVR RALY 
GCAGGCTAAACTGGCTGGTGATAAAGCAGAGTTCGTTGAGAAAGTCCGTCGCGCGCTGTA 14100 

LGKIVSYAQGFSQLRAASDE 
CCTGGGTAAAATCGTCTCTTATGCCCAAGGCTTCTCTCAACTGCGTGCCGCGTCTGACGA 14160 

YNWDLNYGEIAKIFRAGCI I 
ATACAACTGGGATCTGAACTACGGCGAAATCGCGAAGATCTTCCGCGCGGGCTGCATCAT 14220 

RAQFLQKITDAYAENKGIAN 
TCGTGCGCAGTTCCTGCAGAAAATTACTGACGCGTATGCTGAAAACAAAGGCATTGCTAA 14280 

LLLAPYFKNIADEYQQALRD 
CCTGTTGCTGGCTCCGTACTTCAAAAATATCGCTGATGAATATCAGCAAGCGCTGCGTGA 14340 

VVAYAVQNGI PVPTFSA AVA 
TGTAGTGGCTTATGCTGTGCAGAACGGTATTCCGGTACCGACCTTCTCTGCAGCGGTAGC 14400 

YYDSYRSAVL PANLIQAQRD 
CTACTACGACAGCTACCGTTCTGCGGTACTGCCGGCTAATCTGATTCAGGCACAGCGTGA 14460 

YFGAHTYKRTDKEGVFHTG 
TTACTTCGGTGCGCACACGTATAAACGCACTGATAAAGAAGGTGTGTTCCACACCG 14516 
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GTAACCAAGGGCGGTACGTGCATAAATTTTAATGCTTATCAAAACTATTAGCATTAAAAA 



Start of orf1 

MNKETVSI IMPVYN 
TATATAAGAAATTCTCAAATGAAGAAAGAAACCGTTTCAATAATTATGCCCGTTTACAAT 

GAKTI ISSVESIIHQSYQDF 
GGGGCCAAAACTATAATCTCATCAGTAGAATCAATTATACATCAATCTTATCAAGATTTT 

VLYI IDDCSTDDTFSLINSR 
GTTTTGTATATCATTGACGATTGTAGCACCGATGATACATTTTCATTAATCAACAGTCGA 

YKNNQKIRILRNKTNLGVAE 
TACAAAAACAATCAGAAAATAAGAATATTGCGTAACAAGACAAATTTAGGTGTTGCAGAA 

SRNYGIEMATGKYISFCDAD 
AGTCGAAATTATGGAATAGAAATGGCCACGGGGAAATATATTTCTTTTTGTGATGCGGAT 

DLWHEKKLERQIEVLNNECV 
GATTTGTGGCACGAGAAAAAATTAGAGCGTCAAATCGAAGTGTTAAATAATGAATGTGTA 

DVVCSNYYVIDNNRNIVGEV 
GATGTGGTATGTTCTAATTATTATGTTATAGATAACAATAGAAATATTGTTGGCGAAGTT 

NAPHVINYRKMLMKNYIGNL 
AATGCTCCTCATGTGATAAATTATAGAAAAATGCTCATGAAAAACTACATAGGGAATTTG 

TGIYNANKLGKFYQKKIGHE 
ACAGGAATCTATAATGCCAACAAATTGGGTAAGTTTTATCAAAAAAAGATTGGTCACGAG 

DYLMWLEI INKTNGAICIQD 
GATTATTTGATGTGGCTGGAAATAATTAATAAAACAAATGGTGCTATTTGTATTCAAGAT 

NLAYYMRSNNSLSGNKIKAA 
AATCTGGCGTATTACATGCGTTCAAATAATTCACTATCGGGTAATAAAATTAAAGCTGCA 

KWTWSIYREHLHLSFPKTLY 
AAATGGACATGGAGTATATATAGAGAACATTTACATTTGTCCTTTCCAAAAACATTATAT 

YFLLYASNGVMKKITHSLLR 
TATTTTTTATTATATGCTTCAAATGGAGTCATGAAAAAAATAACACATTCACTATTAAGG 



Start of orf2. End of orf1 

R K E T K K * 



AGAAAGGAGACTAAAAAGTGAAGTCAGCGGCTAAGTTGATTTTTTTATTCCTATTTACAC 

LYSLQLYGVI IDDRITNFDT 
TTTATAGTCTCCAGTTGTATGGGGTTATCATAGATGATCGTATAACAAATTTTGATACAA 

KVLTSIIIIFQIFFVLLFYL 
AGGTATTAACTAGTATTATAATTATATTTCAGATTTTTTTTGTTTTATTATTTTATCTAA 

TIINERKQQKKFIVNWELKL 
CGATTATAAATGAAAGAAAACAGCAGAAAAAATTTATCGTGAACTGGGAGCTAAAGTTAA 

ILVFLFVTIEIAAVVLFLKE 
TACTCGTTTTCCTTTTTGTGACTATAGAAATTGCTGCTGTAGTTTTATTTCTTAAAGAAG 

*^^P^FDDDPGGAKLRIAEGN 
GTATTCCTATATTTGATGATGATCCAGGGGGGGCTAAACTTAGAATAGCTGAAGGTAATG 



60 

120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 

900 
960 
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GLYIRYIKYFGNIVVFALI I 
GACTTTACATTAGATATATTAAGTATTTTGGTAATATAGTTGTGTTTGC ATTAATTATTC 1260 

L YDEHKFKQRTI IFVYFTTI 
TTTATGATGAGCATAAATTCAAACAGAGGACCATCATATTTGTATATTTTACAACGATTG 13 20 

ALFGYRSELVL LILQY I L-IT 
CTTTATTTGGTTATCGTTCTGAATTGGTGTTGCTCATTCTTCAATATATATTGATTACCA 13 80 

NILSKDNRNPKIKRI IGYFL 
ATATCCTGTCAAAGGATAACCGTAATCCTAAAAT AAAAAG AATAATAGGGTATTTTTTAT 1440 

LVGVVCSLFYLSLGQDGEQN 
TGGTAGGGGTTGTATGCTCGTTGTTTTATCTAAGTTTAGGACAAGACGGAGAACAAAATG 1500 

DSYNNMLRIINRLTIEQVEG 

ACTC ATATAATAATATGTTAAGGATAATTAATAGGTTAAC AATAGAGC AAGTTGAAGGTG 1560 

VPYVVSESIKNDFFPTP ELE 
TTCCATATGTTGTTTCTGAATCTATTAAGAACGATTTCTTTCCGACACCAGAGTTAGAAA 1620 

KELKAI INRIQGIKHQDLFY 
AGGAATTAAAAGCAATAATAAATAGAATACAGGGAATAAAGCATCAAGACTTATTTTATG 1680 

GERLHKQV FGDMGAN FL SVT 
GAGAACGGTTACATAAAC AAGTATTTGGAGAC ATGGG AGC AAATTTTTTATC AGTTACTA 1740 

TYGAELLVFFGFLCVFI IPL 
CGTATGGAGCAGAACTGTTAGTTTTTTTTGGTTTTCTCTGTGTATTCATTATCCCTTTAG 1800 

GIYIPFYLLKRMKKTHSSIN 
GGATATATATACCTTTTTATCTTTTAAAGAGAATGAAAAAAACCC ATAGCTCGATAAATT 1860 

CAFYSYI IMILLQYLVAGNA 

GCGC ATTCTATTCATATATC ATTATGATTTTATTGC AATACTTAGTGGCTGGGAATGC AT 1920 

SAFFFGPFLSVLIMCTPLIL 
CGGCCTTCTTTTTTGGTCCTTTTCTCTCCGTATTGATAATGTGTACTCCTCTGATCTTAT 1980 



Start of orf3 

MKISVITVTY 
LHDTLKRLSRNENI SYNCDL 
TGCATGATACGTTAAAGAGATTATCACGAA ATGAAAATATCAGTTATAACTGTGACTTA T 2040 



End of orf2 

NNAEGLEKTLS SLS ILKIKP 

* 

AATAATGCTGAAGGGTTAGAAAAAACTTTAAGTAGTTTATC AATTTTAAAAATAAAACCT 2100 

FEIIIVDGGSTDGTNRVISR 
TTTGAGATTATTATAGTTGATGGCGGCTCTACAGATGGAACGAATCGTGTC ATTAGTAGA 2160 

FTSMNITHVYEKDEG lYDAM 
TTTACTAGTATGAATATTAC ACATGTTTATGAAAAAGATGAAGGGATATATGATGCGATG 2220 

NKGRMLAKGDLIHYLNAGDS 
AATAAGGGCCGAATGTTGGCCAAAGGCGACTTAATACATTATTTAAACGCCGGCGATAGC 2280 

VIGDIYKNIKEPCLIKVGLF 
GTAATTGGAG ATATATATAAAAATATC AAAGAGCC ATGTTTG ATTAAAGTTGGCCTTTTC 2340 

ENDKLLGFSSITHSNTGYCH 
GAAAATG ATAAACTTCTGGGATTTTCTTCTATAACCC ATTC AAATAC AGGGTATTGTC AT 2 400 
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QGVIFPKNHSEYDLRYKICA 
CAAGGGGTGATTTTCCCAAAGAATCATTCAGAATATGATCTAAGGTATAAAATATGTGCT 2 460 

DYKLIQEVFPEGLRSLSLIT 
GATTATAAGCTTATTCAAGAGGTGTTTCCTGAAGGGTTAAGATCTCTATCTTTGATTACT 2520 

SGYVKYDMGGVS SKKRILRD 
TCGGGTTATGTAAAATATGATATGGGGGGAGTATCTTCAAAAAAAAGAATTTTAAGAGAT 2 580 

KELAKIMFEKNKKNLIKFIP 
AAAGAGCTTGCCAAAATTATGTTTGAAAAAAATAAAAAAAACCTTATTAAGTTTATTCCA 2640 

ISI IKILFPERLRRVLRKMQ 
ATTTC AATAATCAAAATTTTATTCCCTGAACGTTTAAGAAGAGTATTGCGGAAAATGCAA 2700 



Start of orf4 End of orf3 

YICLTLFFMKNSSPYDNE* 

M I M N K I 

TATATTTGTCTAACTTTATTCTTCATGAAGAATAGTTCACCATATGATAATGAATAAAAT 27 60 

KKILKFCTLKKYDTSSALGR 
CAAAAAAATACTTAAATTTTGCACTTTAAAAAAATATGATACATCAAGTGCTTTAGGTAG 2 82 0 

EQERYRI ISLSVISSLISKI 
AGAACAGGAAAGGTACAGGATTATATCCTTGTCTGTTATTTCAAGTTTGATTAGTAAAAT 2 880 

LSLLSLILTVSLTLPYLGQE 
ACTCTCACTACTTTCTCTTATATTAACTGTAAGTTTAACTTTACCTTATTTAGGACAAGA 2940 

RFGVWMTITSLGAALTFLDL 
GAGATTTGGTGTATGGATGACTATTACCAGTCTTGGTGCTGCTCTGACATTTTTGGACTT 3 000 

GIGNALTNRIAHSFACGKNL 
AGGTATAGGAAATGC ATTAACAAACAGGATCGC ACATTCATTTGCGTGTGGCAAAAATTT 3 060 

KMSRQI SGGLTLLAGLSFVI 
AAAGATGAGTCGGC AAATTAGTGGTGGGCTC ACTTTGCTGGCTGGATTATCGTTTGTCAT 312 0 

TAICYITSGMIDWQLVIKGI 
AACTGCAATATGCTATATTACTTCTGGCATG ATTGATTGGC AACTAGTAATAAAAGGTAT 3180 

NENVYAELQHSIKVFVI IFG 
AAACGAGAATGTGTATGCAGAGTTAC AAC ACTCAATTAAAGTCTTTGTAATCATATTTGG 3240 

LGIYSNGVQKVYMGIQKAYI 
ACTTGGAATTTATTC AAATGGTGTGCAAAAAGTTTATATGGGAATACAAAAAGCCTATAT 3 300 

SNIVNAIFIL LSIITLVISS 
AAGTAATATTGTTAATGCCATATTTATATTGTTATCTATTATTACTCTAGTAATATCGTC 3 3 60 

KLHAGLPVLIVSTLGIQYIS 
GAAACTAC ATGCGGGACTACCAGTTTTAATTGTC AGCACTCTTGGTATTCAATACATATC 3 420 

GIYLTINLI IKRLIKFTKVN 
GGGAATCTATTTAAC AATTAATCTTATTATAAAGCGATTAATAAAGTTTACAAAAGTTAA 3 480 

IHAKREAPYLILNGF FFFIL 
C ATAC ATGCTAAAAGAGAAGCTCC ATATTTG ATATTAAACGGTTTTTTCTTTTTTATTTT 3 540 

QLGTLATWSGDNFI ISITLG 
AC AGTTAGGCACTCTGGCAAC ATGGAGTGGTGATAACTTTATAATATCTATAAC ATTGGG 3 600 
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VTYVAVFS ITQRLFQI STVP 
TGTTACTTATGTTGCTGTTTTTAGCATTACACAGAGATTATTTCAAATATCTACGGTCCC 3 660 

LTIYNI PLWAAYADAHARND 
TCTTACGATTTATAACATCCCGTTATGGGCTGCTTATGC AGATGCTCATGC ACGC AATGA 3720 

TQFIKKTLRTSLKIVGIS SF 
TACTCAATTTATAAAAAAGACGCTCAGAACATC ATTGAAAATAGTGGGTATTTC ATCATT 3 780 

LLAFILVVFGSEVVNIWTEG 
CTTATTGGCCTTCATATTAGTAGTGTTCGGTAGTG AAGTCGTTAATATTTGGACAGAAGG 3 840 

KIQVPRTFI lAYALWSVIDA 
AAAGATTCAGGTACCTCGAACATTCATAATAGCTTATGCTTTATGGTCTGTTATTGATGC 3 900 

FSNTFASFLNGLNIVKQ QML 
TTTTTCGAATACATTTGCAAGCTTTTTAAATGGTTTGAACATAGTTAAAC AAC AAATGCT 3 960 

AVVTLILIAIPAKYIIVSHF 
TGCTGTTGTAACATTGATATTGATCGCAATTCCAGCAAAATACATCATAGTTAGCCATTT 4020 

GLTVMLYCFIFIYIVNYFIW 
T GGGTTAACTGTTATGTTGTACTGGrrTCATTTTTATATATATTGTAAATTACTTTATATG 4080 



Start of orf5. End of orf4 

M K M 

YKCSFKKHIDRQLNIRG* 
GTATAAATGTAGTTTTAAAAAAOATATCGATAGACAGTTAAATATAAGAGG ATC AAAA 'PG 4140 

KYIPVYQPSLTGKEKEYVNE 
AAATATATACCAGTTTACCAACCGTCATTGACAGGAAAAGAAAAAGAATATGTAAATGAA 4200 

CLDSTWISSKGNYIQKFENK 
TGTCTGGACTCAACGTGGAT'FrCATCAAAAGGAAACTATATTCAGAAGTTTGAAAATAAA 42 60 

FAEQNHVQYATTVSNGTVAL 
TTTCCGGAACAAAACCATGTGCAATATGCAACTACTGTAAGTAATGGAACGOTTGCTCTT 4320 

HLALLALGISEGDEVIVPTL 
CATTTAGCTTTGTTAGCGTTAGGTATATCGGAAGGAGATGAAGTTATTGTTCCAACACTG 43 80 

TYIASVNAIKYTGATPIFVD 
ACATATATAGCATCAGTTAATGCTATAAAATACACAGGAGCCACCCCCATTTTCGTTGAT 4440 

SDNETWQMSVSDIEQKITNK 
TCAGATAATGAAACTTGGCAAATGTCTGTTAGTGACATAGAACAAAAAATCACTAATAAA 4500 

TKAIMCVHLYGHPCDMEQIV 
ACTAAAGCTATTATGTGTGTCCATTTATACGGACATCCATGTGATATGGAACAAATTGTA 4560 

ELAKSRNLFVIEDCAEAFGS 
GAACTGGCCAAAAGTAGAAATTTGTTTGTAATTGAAGATTGCGCTGAAGCCTTTGOTTCT 4620 

KYKGKYVGTFGDISTFSF FG 
AAATATAAAGGTAAATATGTGGGAACATTTGGAGATATTTCTACTTTTAGCTTTTTTGGA 4680

NKTITTGEGGMVVTNDKTLY
AATAAAACTATTACTACAGGTGAAGGTGGAATGGTTGTCACGAATGACAAAACACTTTAT 4740

DRCLHFKGQGLAVHRQYWHD
GACCGTTGTTTACATTTTAAAGGCCAAGGATTAGCTGTACATAGGCAATATTGGCATGAC 4800

VIGYNYRMTNICAAIGLAQL
GTTATAGGCTACAATTATAGGATGACAAATATCTGCGCTGCTATAGGATTAGCCCAGTTA 4860
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EQADDFISRKREIADIYKKN 
GAACAAGCTGATGATTTTATATCACGAAAACGTGAAATTGCTGATATTTATAAAAAAAAT 4920

INSLVQVHKESKDVFHTYWM
ATCAACAGTCTTGTACAAGTCCACAAGGAAAGTAAAGATGTTTTTCACACTTATTGGATG 4980

VSILTRTAEEREELRNHLAD 
GTCTCAATTCTAACTAGGACCGCAGAGGAAAGAGAGGAATTAAGGAATCACCTTGCAGAT 5040 

KLIETRPVFYPVHTMPMYSE 
AAACTCATCGAAACAAGGCCAGTTTTTTACCCTGTCCACACGATGCCAATGTACTCGGAA 5100 

KYQKHPIAEDLGWRGINLPS 
AAATATCAAAAGCACCCTATAGCTGAGGATCTTGGTTGGCGTGGAATTAATTTACCTAGT 5160 

FPSLSNEQVIYICESINEFY 
TTCCCCAGCCTATCGAATGAGCAAGTTATTTATATTTGTGAATGTATTAACGAATTTTAT 5220 



End of orf5 Start of orf6

SDK* MKIALNSD
AGTGATAAATAGCCTAAAATATTGTAAAGGTCATTCATGAAAATTGCGTTGAATTCAGAT 5280

GFYEWGGGIDFIKYILSILE 
GGATTTTACGAGTGGGGCGGTGGAATTG ATTTTATTAAATATATTCTGTCAATATTAGAA 5340 

TKPEICIDILLPRNDIHSLI 
ACGAAACC AG AAATATGTATCGATATTCTTTTACCGAGAAATG ATATAC ATTCTCTTATA 5400 

REKAFPFKSILKAILKRERP 
AGAGAAAAAGC ATTTCCTTTTAAAAGTATATTAAAAGCAATTTTAAAGAGGGAAAGGCCT 5460 

RWISLNRFNEQYYRDAFTQN 
CG ATGGATTTC ATTAAATAG ATTTAATGAGC AATACTATAGAG ATGCCTTTAC AC AAAAT 5520 

NIETNLTFIKSKS SAFYSYF 
AATATAGAGACGAATCTTACCTTTATTAAAAGTAAGAGCTCTGCCTTTTATTCATATTTT 5580 

DSSDCDVILPCMRVPSGNLN 
GATAGTAGCGATTGTG ATGTTATTCTTCCTTGCATGCGTGTTCCTTCGGGAAATTTGAAT 5640 

KKAWIGYIYDFQHCYYPSFF 
AAAAAAGCATGGATTGGTTATATTTATGACTTTCAACACTGTTACTATCCTTCATTTTTT 5700 

SKREIDQRNVF FKLMLNCAN 
AGTAAGCGAGAAATAGATCAAAGGAATGTGTTTTTTAAATTGATGCTC AATTGCGCTAAC 5760 

NI IVNAHSVITDANKYVGNY 
AATATTATTGTTAATGCACATTCAGTTATTACCGATGC AAATAAATATGTTGGGAATT AT 5820 

SAKLHSLPFSPCPQLKWFAD 
TCTGCAAAACTACATTCTCTTCCATTTAGTCCATGCCCTCAATTAAAATGGTTCGCTGAT 5880 

YSGNIAKYNIDKDYFI ICNQ 
TACTCTGGTAATATTGCC AAATATAATATTGACAAGGATTATTTTATAATTTGCAATC AA 5940 

FWKHKDHATAFRAFKIYTEY 
TTTTGGAAAC ATAAAG ATC ATGC AACTGCTTTTAGGGCATTTAAAATTTATACTGAATAT 6000 

NPDVYLVCTGATQDYRFPGY 
AATCCTGATGTTTATTTAGTATGCACGGG AGCTACTCAAGATTATCGATTCCCTGGATAT 6060 

FNELMVLAKKLGIESKIKIL 
TTTAATGAATTGATGGTTTTGGC AAAAAAGCTCGGAATTGAATCGAAAATTT^GATATTA 6120
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GHIPKLEQIELIKNCIAVIQ 
GGGC ATATACCTAAACTTGAACAAATTG AATTAATCAAAAATTGCATTGCTGTAATACAA 6180 

PTLFEGGPGGGVTFDAIALG 
CC AACCTTATTTGAAGGCGGGCCTGGAGGGGGGGTAACATTTGACGCTATTGCATTAGGa 6240 

KKVI LSDIDVNKEVNCGDVY 
AAAAAAGTT ATACTATCTGACATAGATGTC AATAAAGAAGTTAATTGCGGTG ATGTATAT 6300

FFQAKNHYSLNDAMVKADES
TTCTTTCAGGCAAAAAACC ATTATTCATTAAATGACGCGATGGTAAAAGCTGATGAATCT 6360

KIFYEPTTLIELGLKRRNAC 
AAAATTTTTTATGAACCTACAACTCTGATAGAATTGGGTCTCAAAAGACGCAATGCGTGT 6420 

End of orf6

ADFLLDVVKQEIESRS* 
GCAGATTTTCTTTTAGATGTTGTGAAACAAGAAATTGAATCCCGATCT TAATATATTCAA 6480 



Start of orf7 

MTKVALITGVTGQDGSY 
GAGGTATATAATGACTAAAGTCGCTCTTATTACAGGTGTAACTGGACAAGATGGATCTTA 6540 

LAEFLLDKGYEVHGIKRRAS 
TCTAGCTGAGTTTTTGCTTGATAAAGGGTATGAAGTTCATGGTATCAAACGCCGAGCCTC 6600 

SFNTERIDHIYQDPHGSNPN 
ATCTTTTAATACAGAACGCATAGACCATATTTATCAAGATCCACATGGTTCTAACCCAAA 6660 

FHLHYGDLTDS SNLTRILKE 
TTTTCACTTGC ACTATGGAGATCTGACTGATTCATCTAACCTCACTAGAATTCTAAAGGA 6720

VQPDEVYNLAAMSHVAVSFE 
GGTACAGCCAGATGAAGTATATAATTTAGCTGCTATGAGTCACGTAGCAGTTTCTTTTGA 6780 

SPEYTADVDAIGTLRL LEAI 
GTCTCCAGAATATACAGCCGATGTCGATGC AATTGGTACATTACGTTTACTGGAAGCAAT 6840 

RFLGLENKTRFYQASTSELY 
TCGCTTTTTAGGATTGGAAAACAAAACGCGTTTCTATCAAGCTTCAACCTCAGAATTATA 6900

GLVQEIPQKESTPFYPRSPY 
TGGACTTGTTCAGGAAATCCCTCAAAAAGAATCCACCCCTTTTTATCCTCGTTCCCCTTA 6960 

AVAKLYAYWITVNYRESYGI 
TGCAGTTGCAAAACTTTACGCATATTGGATC ACGGTAAATTATCGAGAGTCATATGGTAT 7020

YACNGILFNHESPRRGETFV 
TTATGCATGTAATGGTATATTGTTCAATCATGAATCTCC ACGCCGTGGAGAAACGTTTGT 7080 

TRKITRGLANIAQGLESCLY 
AACAAGGAAAATTACTCGAGGACTTGCAAATATTGCACAAGGCTTGGAATCATGTTTGTA 7140 

LGNMDSLRDWGHAKDYVRMQ 
TTTAGGG AATATGGATTCGTTACG AGATTGGGGAC ATGC AAAAG ATTATGTTAG AATGCA 7200

WLMLQQEQPEDFVIATGVQY
ATGGTTGATGTTACAACAGGAGCAACCCGAAGATTTTGTGATTGCAACAGGAGTCCAATA 7260

SVRQFVEMAAAQLGIKMSFV
CTCAGTCCGTC AGTTTGTCGAAATGGCAGCAGC ACAACTTGGT ATTAAGATGAGCTTTGT 7320
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*^J^GIEEKGIVDSVEGQDAPG 
TGGTAAAGGAATCGAAGAAAAAGGCATTGTAGATTCGGTTGAAGGACAGGATGCTCCAGG 7380

VKPGDVIVAVDPRYFRPAEV
TGTGAAACCAGGTGATGTCATTGTTGCTGTTGATCCTCGTTATTTCCGACCAGCTGAAGT 7440

DTLLGDPSKANLKLGWRPEI
TGATACTTTGCTTGGAGATCCGAGCAAAGCTAATCTCAAACTTGGTTGGAGACCAGAAAT 7500

TLAEMISEMVAKDLEAAKKH
TACTCTTGCTGAAATGATTTCTGAAATGGTTGCCAAAGATCTTGAAGCCGCTAAAAAACA 7560

Start of orf8, End of orf7

M M M N K

SLLKSHGFSVSLALE*
TTCTCTTTTAAAATCGCATGGTTTTTCTGTAAGCTTAGCTCTGGAATGATGATGAATAAG 7620

QRIFIAGHQGMVGSAITR RL
CAACGTATTTTTATTGCTGGTCACCAAGGAATGGTTGGATCAGCTATTACCCGACGCCTC 7680

KQRDDVELVLRTRDELNL LD
AAACAACGTGATGATGTTGAGTTGGTTTTACGTACTCGGGATGAATTGAACTTGTTGGAT 7740

SSAVLDFFSSQKIDQVYLAA
AGTAGCGCTGTTTTGGATTTTTTTTCTTCACAGAAAATCGACCAGGTTTATTTGGCAGCA 7800

AKVGGILANS SYPADFIYEN
GCAAAAGTCGGAGGTATTTTAGCTAACAGTTCTTATCCTGCCGATTTTATATATGAGAAT 7860

IMIEANVIHAAHKNNVNKLL
ATAATGATAGAGGCGAATGTCATTCATGCTGCCCACAAAAATAATGTAAATAAACTGCTT 7920

FLGSSCIYPKLAHQPIMEDE
TTCCTCGGTTCGTCGTGTATTTATCCTAAGTTAGCACACCAACCGATTATGGAAGACGAA 7980

LLQGKLEPTNEPYAIAKIAG
TTATTACAAGGGAAACTTGAGCCAACAAATGAACCTTATGCTATCGCAAAAATTGCAGGT 8040

IKLCESYNRQFGRDYRSVMP
ATTAAATTATGTGAATCTTATAACCGTCAGTTTGGGCGTGATTACCGTTCAGTAATGCCA 8100

TNLYGPNDNFHPSNSHVI PA
ACCAATCTTTATGGTCCAAATGACAATTTTCATCCAAGTAATTCTCATGTGATTCCGGCG 8160

LLRRFHDAVENNSPNVVVWG
CTTTTGCGCCGCTTTCATGATGCTGTGGAAAACAATTCTCCGAATGTTGTTGTTTGGGGA 8220

SGTPKREFLHVDDMASASIY
AGTGGTACTCCAAAGCGTGAATTCTTACATGTAGATGATATGGCTTCTGCAAGCATTTAT 8280

VMEMPYDIWQKNT KVMLSHI
GTCATGGAGATGCCATACGATATATGGCAAAAAAATACTAAAGTAATGTTGTCTCATATC 8340

NIGTGIDCTI CELAETIAKV
AATATTGGAACAGGTATTGACTGCACGATTTGTGAGCTTGCGGAAACAATAGCAAAAGTT 8400

VGYKGHITFDTTKPDGAPRK
GTAGGTTATAAAGGGCATATTACGTTCGATACAACAAAGCCCGATGGAGCCCCTCGAAAA 8460

LLDVTLLHQLGWNHKITLHK
CTACTTGATGTAACGCTTCTTCATCAACTAGGTTGGAATCATAAAATTACCCTTCACAAG 8520

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Kzid of orf. 8 

G LENTYNWFLENQLQYRG * 
GGTCTTG AAAATACATACAACTGGTTTCTTGAAAACCAACTTCAATATCGGGGG TAATAA 8580 



Start of orf9

M-FLHSQDFATIVRSTPLISI 
^IGTTTTTACATTCCCAAGACTTTGCCACAATTGTAAGGTCTACTCCTCTTATTTCTATAG 



AQGYWFVPGGRVLKDEKLQT 
CACAGGGCTATTGGTTCGTTCCTGGTGGTAGGGTGTTGAAAGATGAAAAATTGCAGACAG 



TSGNHYWNSGIFMFKASVYL 
GACTTCTGGTAATCACTATTGGAATAGTGGAATATTCATGTTTAAGGCATCTGTTTATCT 



8640 



DLIVENEFGEILLGKRINRP 
ATTTGATTGTGGAAAACGAGTTTGGCGAAATTTTGCTAGGAAAACGAATCAACCGCCCGG 8700 



8760 



AFERLTEIELGIRLPLSVGK 
CCTTTGAACGATTGACAGAAATTGAACTAGGAATTCGTTTGCCTCTCTCTGTGGGTAAGT 8820 

FYGIWQHFYEDNSMGGDFST 
TTTATGGTATCTGGCAGCACTTCTACGAAGACAATAGTATGGGGGGAGACTTTTCAACGC 8880 

HYIVIAFLLKLQPNILKLPK 
ATTATATAGTTATAGCATTCCTTCTTAAATTACAACCAAACATTTTGAAATTACCGAAGT 8940 

SQHNAYCWLSRAKLINDDDV 

C ACAAC ATAATGCTTATTGCTGGCTATCGCGAGC AAAGCTG ATAAATGATGACGATGTGC 9000

HYNCRAYFNNKTNDAIGLDN 
ATTATAATTGTCGCGCATATTTTAACAATAAAACAAATGATGCGATTGGCTTAGATAATA 9060 

Start of orf 10 End of orf9 

MSDAPI IAVVMAGGTGS
KDI ICLMRQ* 

AGGATATAATaTGTCTGATGCGCC AA TAATTGCTGTAGTTATGGCCGGTGGTAC AGGC AG 9120 

RLWPLSRELYPKQFLQLSGD 
TCGTCTTTGGCCACTTTCTCGTGAACTATATCCAAAGC AGTTTTTACAACTCTCTGGTGA 9180 

NTLLQTTLLRLSGLSCQKPL 
TAAC ACCTTGTTACAAACGACTTTGCTACGACTTTCAGGCCTATCATGTCAAAAACCATT 9240 

VITNEQHRFVVAEQLREINK 
AGTGATAACAAATGAACAGCATCGCTTTGTTGTGGCTGAACAGTTAAGGGAAATAAATAA 9300 

LNGNI ILEPCGRNTAPAIAI 
ATTAAATGGTAATATTATTCTAGAACCATGCGGGCGAAATACTGCACCAGCAATAGCGAT 9360

SAFHALKRNPQEDPLLLVLA 
ATCTGCGTTTCATGCGTTAAAACGTAATCCTCAGGAAGATCCATTGCTTCTAGTTCTTGC 9420 

ADHVIAKESVFCDAIKNATP 
GGCAGACCACGTTATAGCTAAAGAAAGTGTTTTCTGTGATGCTATTAAAAATGCAACTCC 9480

IANQGKIVTFGI IPEYAETG
CATCGCTAATCAAGGTAAAATTGTAACGTTTGGAATTATACCAGAATATGCTGAAACTGG 9540

YGYIERGELSVPLQGHENTG
TTATGGGTATATTGAGAGAGGTGAACTATCTGTACCGCTTCAAGGGCATGAAAATACTGG 9600

FYYVNKFVEKPNRETAELYM 
TTTTTATTATGTAAATAAGTTTGTCGAAAAGCCTAATCGTGAAACCGCAGAATTGTATAT 9660 



9720 
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E ELRKFRPDIYNVCEQVAS S 
TG AGG AATTGAGAAAATTTAGACCTGAC ATTTAC AATGTTTGTGAACAGGTTGCCTCATC 9780 

SYIDLDFIRLSKEQFQDCPA 
CTC ATACATTGATCTAGATTTTATTCGATTATCAAAAGAACAATTTCAAGATTGTCCTGC 9840 

ESIDFAVMEKTEKCVVCPVD 
TGAATCTATTGATTTTGCTGTAATGGAAAAAACAGAAAAATGTGTTGTATGCCCTGTTGA 9900 

IGWSDVGSWQSLWDISLKSK 
TATTGGTTGGAGTGACGTTGGATCTTGGCAATCGTTATGGGAC ATTAGTCTAAAATCGAA 9960 

TGDVCKGDILTYDTKNNYIY 
AACAGGAGATGTATGTAAAGGTGATATATTAACCTATG ATACTAAGAATAATTATATCTA 10020 

SESALVAAIGIEDMVIVQTK 
CTCTG AGTCAGCGTTGGTAGCCGCC ATTGGAATTG AAGATATGGTTATCGTGC AAACTAA 10080 

DAVLVSKKSDVQHVKKIVEM 
AG ATGCCGTTCTTGTGTCTAAAAAGAGTG ATGTAC AGCATGTAAAAAAAATAGTCGAAAT 10140 

LKLQQRTEYI SHREVFRPWG 
GCTTAAATTGCAGCAACGTACAGAGTATATTAGTCATCGTGAAGTTTTCCGACCATGGGG 10200 

KFDSIDQGERYKVK KIIVKP 
AAAATTTGATTCGATTGACCAAGGTGAGCGATACAAAGTCAAGAAAATTATTGTGAAACC 10260 

GEGLSLRMHHHRSEHWIVLS 
TGGTG AGGGGCTTTCTTTAAGGATGC ATCACCATCGTTCTGAACATTGGATCGTGCTTTC 10320 

GTAKVTLGDKTKLVTANESI 
TGGTAC AGCAAAAGTT^CCCTTGGCGATAAAACTAAACTAGTC ACCGCAAATGAATCGAT 10380 

YIPLGAAYSLENPGI IPLNL 
ATACATTCCCCTTGGCGCAGCGTATAGTCTTGAGAATCCGGGCATAATCCCTCTTAATCT 10440 

IEVSSGDYLGEDDI IRQKER
TATTGAAGTCAGTTCAGGGGATTATTTGGGAGAGGATGATATTATAAGACAGAAAGAACG 10500 

End of orf10 Start of orf11

YKHED* MKSLTCFKAYDIR 
TTACAAACATGAAGAT TAACATATGAAATCTTTAACCTGCTTTAAAGCCTATGATATTCG 10560 

GKLGEELNEDIAWRIGRAYG 
CGGGAAATTAGGCGAAGAACTGAATGAAGATATTGCCTGGCGC ATTGGGCGTGCCTATGG 10620 

EFLKPKTIVLGGDVRLTSEA 
CGAATTTCTCAAACCGAAAACCATTGTTTTAGGCGGTGATGTCCGCCTCACCAGCGAAGC 10680 

LKLALAKGLQDAGVDVLDIG 
GTTAAAACTGGCGCTTGCGAAAGGTTTAC AGGATGCGGGCGTCG ATGTGCTGG ATATCGG 10740 

MSGTEEIYFATFHLGVDGGI 
TATGTCCGGC ACCGAAGAGATCTATTTCGCCACGTTCCATCTCGGAGTGGATGGCGGCAT 10800 

EVTASHNPMDYNGMKLVREG 
CGAAGTTACCGCCAGCCATAACCCGATGGATTACAACGGCATGAAGCTGGTGCGCGAAGG 10860 

ARPI SGDTGLRDVQRLAEAN 
GGCTCGCCCGATCAGCGGTGATACCGGACTGCGCGATGTCCAGCGTCTGGC AGAAGCCAA 10920

DFPPVDETKRGRYQ QINLRD 
TGACTTCCCTCCTGTCGATGAAACCAAACGTGGTCGCTATCAGCAAATCAATCTGCGTG A 10980 
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AYVDHLFGYINVKNLTPLKL 
CGCTTACGTTGATC ACCTGTTCGGTTATATCAACGTC AAAAACCTC ACGCCGCTC AAGCT 11040 

VINSGNGAAGPVVDAIEARF 
GGTGATCAACTCCGGGAACGGCGC AGCGGGTCCGGTGGTGGACGCCATTGAAGCCCGATT 11100 

KALGAPVELIKVHNTPDGNF 
TAAAGCCCTCGGCGCACCGGTGGAATTAATC AAAGTAC ACAACACGCCGGACGGCAATTT 11160 

PNGI PNPLLPECRDDTRNAV 
CCCCAACGGTATTCCTAACCCGCTGCTGCCGGAATGCCGCGACGAC ACCCGTAATGCGGT 11220 

IKHGADMGIAFDGDFDRCFL 
CATCAAACACGGCGCGGATATGGGCATTGCCTTTGATGGCGATTTTGACCGCTGTTTCCT 11280 

FDEKGQFIEGYYIVGLLAEA 
GTTTGACGAAAAAGGGCAGTTTATCGAGGGCTACTACATTGTCGGCCTGCTGGCAGAAGC 11340 

FLEKNPGAKI IHDPRLSWNT 
GTTCCTCGAAAAAAATCCCGGCGCGAAGATCATCCACGATCCACGTCTCTCCTGGAACAC 11400 

VDVVTAAGGTPVMSKTGHAF 
CGTTGATGTGGTGACTGCCGCAGGCGGCACCCCGGTAATGTCGAAAACCGGACACGCCTT 11460 

IKERMRKEDAIYGGEMSAHH 
TATTAAAGAACGT ATGCGC AAGGAAG ACGCC ATCTACGGTGGCGAAATGAGCGCTC ACC A 11520 

YFRDFAYCDSGMI PWL LVAE 
TTACTTCCGTGATTTCGCTTACTGCGACAGCGGCATGATCCCGTGGCTGCTGGTCGCCGA 11580 

LVCLKGKTLGEMVRDRMAAF 
ACTGGTGTGCCTGAAAGGAAAAACGCTGGGCGAAATGGTGCGCGACCGGATGGCGGCGTT 11640 

PASGEINSKLAQPVEAINRV 
TCCGGCAAGCGGTGAGATCAACAGCAAACTGGCGCAACCCGTTGAGGCAATTAATCGCGT 11700 

EQHFSREALAV DRTDGISMT 
GGAACAGCATTTTAGCCGCGAGGCGCTGGCGGTGGATCGCACCGATGGCATCAGCATGAC 11760 

FADWRFNLRS SNTEPVVRLN 
CTTTGCCGACTGGCGCTTTAACCTGCGCTCCTCCAACACCGAACCGGTGGTGCGGTTGAA 11820 

VESRGDVKLMEKKTKALLKL 
TGTGGAATCACGCGGTGATGTAAAGCTAATGGAAAAGAAAACTAAAGCTCTTCTTAAATT 11880 

End of orf11

L S E * 

GCTAAGTGAG TGATTATTTACATTAATCATTAAGCGTATTTAAGATTATATTAAAGTAAT 11940 
GTTATTGCGGTATATGATGAATATGTGGGCTTTTTTATGTATAACGACTATACCGCAACT 12000 

Start of H-repeat:

TTATCTAGG AAAAGATTAATAGAAATAAAGTTTTGTACTGACC AATTTGCATTTCACGTC 12060 

ACGATTGAGACGTTCCTTTGCTTAAGACATTTTTTCATCGCTTATGTAATAAC AAATGTG 12120 

CCTTATATAAAAAGGAG AACAAAATGGAACTTAAAATAATTGAGACAATAGATTTTTATT 12180 

ATCCCTGTTTACGATATTATAGCCAAAGTTGTATCCTGCATCAGTCCTGCAATATTTCAC 12240 

GAGTGCTTTGTTAACTGAATACATGTCTGCC ATTTTCC AGATG ATAACGACGTCATCGC A 12300

ATTG ATGGT AAAAC ACTTCGGCAC ACTTATG AC AAG AGTCGTCGCAGAGG AGTGGTTCAT 12360
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GTCATTAGTGCGTTTCAGCAATGCACAGTCTGGTCCTCGGATAGATCAAGACGGATGAGA 
AACCTAATGCGTTCACAGTTATTCATGAACTTTCTAAAATGATGGGTATTAAAGGAAAAA 12480
TAATCATAACTGATGCGATGGCTTGCCAGAAAGATATTGCAGAGAAGATATAAAAACAGA 12540
GATGTGATTATTTATTCGCTGTAAAAGGAAATAAGAGTCGGCTTAATAGAGTCTTTGAGG 12600
AGATATTTACGCTGAAAGAATTAAATAATCCAAAACATGACAGTTACGCAATTAGTGAAA 12660
AGAGGCACGGCAGAGACGATGTCCGTCTTCATATTGTTTGAGATGCTCCTGATGAGCTTA 12720
TTGATTTCACGTTTGAATGGAAAGGGCTGCAGAATTTATGAATGGCAGTCCACTTTCTCT 12780
CAATAATAGCAGAGCAAAAGAAAGAATCCGAAATGACGATCAAATATTATATTAGATCTG 12840
CTGCTTTAACCGCAGAGAAGTTCGCCACAGTAAATCGAAATCACTGGCGCATGGAGAATA 12900
AGTTGCACAGTAGCCTGATGTGGTAATGAATGAAATCGACTATAATATAAGAAGGCGAGT 12960
TGCATTCGAATGATTTTCTAGAATGCGGCACATCGCTATTAATATCTGACAATGATAATG 13020
TATTCAAGGCAGGATTATCATGTAAGATGCGAAAAGCAGTCATGGACAGAAACTTCCTAG 13080

cgtcaggcattgcagcgtgcgggctttcataatcttgcatt^tttgatL^ga^^ 13140

Start of orf12

mnlygifgagsygre
tttggagatgggaaaatgaatttgtatggtatttttggtgctggaagttatggtagagaa 13200

tipilnqqikqecgsdyalv
acaatacccattctaaatcaacaaataaagcaagaatgtggttctgactatgctctggtt 13260

fvddvlagkkvngfevlstn
tttgtggatgatgttttggcaggaaagaaagttaatggttttgaagtgctttcaaccaac 13320

CFLKAPYLKKYFNVAIANDK
tgctttctaaaagccccttatttaaaaaagtattttaatgttgctattgctaatgataag 13380

IRQRVSESILLHGVEPITIK
atacgacagagagtgtctgagtcaatattattacacggggttgaaccaataactataaaa 13440

HPNSVVYDHTMIGSGAI ISP
catccaaatagcgttgtttatgatcatactatgataggtagtggcgctattatttctccc 13500

fvtistnthigrffhaniys
tttgttacaatatctactaatactcatatagggaggttttttcatgcaaacatatactca 13560

yvahdcqigdyvtfapgakc
tacgttgcacatgattgtcaaataggagactatgttacatttgctcctggggctaaatgt 13620

NGYVVIEDNAYIGSGAVIKQ
aatggatatgttgttattgaagacaatgcatatataggctcgggtgcagtaattaagcag 13680

GVPNRPLI IGAGAI IGMGAV
ggtgttcctaatcgcccacttattattggcgcgggagccattataggtatgggggctgtt 13740

VTKSVPAGITVCGNPAREMK
gtcactaaaagtgttcctgccggtataactgtgtgcggaaatccagcaagagaaatgaaa 13800

End of orf12

R S P T S I *
agatcgccaacatctatttaatgggaatgcgaaaacacgttccaaatgggactaatgttt 13860

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AAAATATATATAATTTCGCTAATTTACTAAATTATGGCTTCTTTTTAAGCTATCCTTTAC 1 3 

TTAGTTATTACTGATACAGCATGAAATTTATAATACTCTGATACATTTTTATACGTTATT 13980
CAAGCCGCATATCTAGCGGTAACCCCTG ACAGGAGTAAACAATG 14024 
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ATGGCACAAGTCATTAATACCAACAGCCTCTCGCTGATCACTCAAAATAATATCAACAAG 

AACCAGTCTGCGCTGTCGAGTTCTATCGAGCGTCTGTCTTCTGGCTTGCGTATTAACAGC 

GCGAAGGATGACGCCGCGGGTCAGGCGATTGCTAACCGTTTTACTTCTAACATTAAAGGC 

CTGACTCAGGCTGCACGTAACGCCAACGACGGTATTTCTGTTGCACAGACCACTGAAGGC 

GCGCTGTCCGAAATCAACAACAACTTACAGCGTATCCGTGAGCTGACGGTTCAGGCTTCT 

ACCGGGACTAACTCTGATTCGGATCTGGACTCCATTCAGGACGAAATCAAATCCCGTCTC 

GACGAAATTGACCGCGTATCCGGTCAGACCCAGTTCAACGGCGTGAACGTACTGGCAAAA 

GACGGTTCGATGAAAATTCAGGTAGGTGCGAACGACGGCCAGACTATCACTATTGATCTG 

AAGAAAATTGACTCTGATACGCTGGGGCTGAATGGTTTTAACGTGAATGGTTCCGGTACG 

ATAGCCAATAAAGCGGCGACCATTAGCGACCTGACAGCAGCGAAAATGGATGCTGCAACT 

AATACTATAACTACAACAAATAATGCGCTGACTGCATCAAAGGCCCTTGATCAACTGAAA 

GATGGTGACACTGTTACTATCAAAGCAGATGCAGCTCAAACTGCCACGGTCTATACATAC 

AATGCATCTGCTGGTAACTTCTCATTCAGTAATGTATCGAATAATACTTCAGCAAAAGCA 

GGTGATGTAGCAGCTAGCCTTCTCCCGCCGGCTGGGCAAACTGCTAGTGGTGTTTACAAA 

GCAGCAAGCGGTGAAGTGAACTTTGATGTTGATGCGAATGGTAAAATTACAATCGGAGGA 

CAGGAAGCCTATTTAACTAGTGATGGTAACTTAACTACAAACGATGCTGGTGGTGCGACT 

GCGGCTACGCTTGATGGTTTATTCT^GAAAGCTGGTGATGGTCTUVTCAATCGGGTTTAAT 

AAGACTGCATCAGTCACGATGGGGGGAACAACTTATAACTTTAAAACGGGTGCTGATGCT 

GGTGCTGCAACTGCTAACGCAGGGGTATCGTTCACTGATACAGCTAGCAAAGAAACCGTT 

TTAAATAAAGTGGCTACAGCTAAACAAGGCACAGCAGTTGCAGCTAACGGTGATACATCC 

GCAACAATTACCTATAAATCTGGCGTTCAGACGTATCAGGCGGTATTTGCCGCAGGTGAC 

GGTACTGCTAGCGCAAAATATGCCGATAATACTGACGTTTCTAATGCAACAGCAACATAC 

ACAGATGCTGATGGTGAAATGACTACAATTGGTTCATACACCACGAAGTATTCAATCGAT 

GCTAACAACGGCAAGGTAACTGTTGATTCTGGAACTGGTTCGGGTAAATATGCGCCGAAA 

GTCGGGGCTGAAGTATATGTTAGTGCTAATGGTACTTTAACAACAGATGCAACTAGCGAA 

GGCACAGTAACAAAAGATCCACTGAAAGCTCTGGATGAAGCTATCAGCTCCATCGACAAA 

TTCCGTTCATCCCTGGGGGCTATCCAAAACCGTTTGGATTCCGCCGTCACCAACCTGAAC 

AACACCACTACCAACCTGTCTGAAGCGCAGTCCCGTATTCAGGACGCCGACTATGCGACC 

GAAGTGTCCAACATGTCGAAAGCGCAGATTATCCAGCAGGCCGGTAACTCCGTGCTGGCA 

AAAGCCAACCAGGTACCGCAGCAGGTTCTGTCTCTACTGCAGGGTTAA 



Figure 7 
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AACAAATCTCAGTCTTCTCTTAGCTCTGCTATT 

GAGCGTCTGTCTTCTGGTCTGCGTATTAACAGCGCAAAAGACGATGCAGCAGGTCAGGCG 
ATTGCTAACCGTTTTACGGCAAATATTAAAGGTCTGACCCAGGCTTCCCGTAACGCAAAT 
GATGGTATTTCTGTTGCGCAGACCACTGAAGGTGCGCTGAATGAAATTAACAACAACCTG 
CAGCGTATTCGTGAACTTTCTGTTCAGGCAACTAACGGTACTAACTCTGACAGTGACCTG 
ACCTCCATCCAGTCCGAAATCCAGCAGCGTCTGAGTGAAATTGACCGTGTTTCTGGTCAG 
ACTCAGTTTAACGGCGTTAAAGTGCTGGCTTCTGATCAGGATATGACTATTCAGGTTGGT 
GCAAACGACGGCGAAACAATTACTATTAAACTGCAGGAAATTAATTCCGACACACTGGGA 
TTATCTGGTTTTGGTATTAAAGATCCTACTAAATTAAAAGCCGCAACGGCTGAAACAACC 
TATTTTGGATCGACAGTTAAGCTTGCTGACGCTAATACACTTGATGCAGATATTACAGCT 
ACAGTTAAAGGCACTACGACTCCGGGCCAACGTGACGGTAATATTATGTCTGATGCTAAC 
GGTAAGTTGTACGTTAAAGTTGCCGGTTCAGATAJU^CCCGCTGAAAATGGTTATTATGAA 
GTTACTGTGGAGGATGATCCGACATCTCCTGATGCAGGTAAGCTGAAGCTGGGGGCTCTA 
GCGGGTACCCAGCCTCAAGCTGGTAATTTAAAGGAAGTCACAACGGTGAAAGGGAAGGGG 
GCTATTGATGTTCAGTTGGGTACTGATACCGCAACCGCTTCTATCACAGGTGCAAAACTC 
TTTAAGTTAGAAGACGCCAATGGCAAAGATACTGGTTCATTTGCGTTGATTGGTGATGAC 
GGTAAACAGTATGCAGCGAATGTTGATCAGAAAACAGGAGCAGTTTCCGTTAAAACAATG 
TCTTACACTGATGCTGACGGTGTCAAACACGACAATGTTAAAGTTGAACTGGGTGGAAGC 
GATGGCAAAACCGAAGTTGTAACTGCAACCGATGGCAAAACTTACAGTGTTAGTGATTTA 
CAAGGTAAGAGCCTGAAAACTGATTCTATTGCAGCAATTTCTACGCAGAAAACAGAAGAT 
CCTTTGGCTGCTATCGATT^AAGCACTGTCTCAGGTTGACTCGTTGCGTTCTAACCTAGGT 
GCAATTCAAAATCGTTTCGACTCTGCCATCACCAACCTTGGCAACACCGTAAACAACCTG 
TCTTCTGCCCGTAGCCGTATCGAAGATGCTGACTACGCGACCGAAGTGTCTAACATGTCT 
CGTGCGCAGATCCTGCAACAAGCGGGTACCTCTGTTCTGGCGCAG 



Figure 8 
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AACAAATCTCAGTCTTCTCTGAGCTCCGCCATTGAACGTCTCTCTTC 

TGGCCTGCGTATTAACAGTGCTAAAGATGACGCAGCAGGTCAGGCGATTGCTAACCGTTT 

TACAGCAAATATTAAAGGTCTGACTCAGGCTTCCCGTAACGCGAATGATGGTATTTCTGT 

TGCGCAGACCACTGAAGGTGCGCTTTCTGAAATCAACAATAACTTACAGCGTATTCGTGA 

ATTGTCAGTACAGGCCACTAATGGTAC;U\ACTCTGACTCCGACCTGAATTCAATTCAGGA 

TGAAATTACACAACGCCTTAGTGAAATTGATCGTGTTTCTAACCAGACACAATTTAATGG 

TGTAAAAGTTCTGGCTTCTGATCAGACTATGAAAATTCAAGTAGGTGCGAACGATGGTGA 

AACCATTGAGATTGCCCTTGATAAAATTGATGCTAAAACCTTGGGGCTTGATAACTTTAG 

CGTAGCACCAGGAAAAGTTCCAATGTCCTCTGCGGTTGCACTTAAGAGCGAAGCCGCTCC 

TGACTTAACTAAGGTAAATGCAACTGATGGTAGTGTGGGAGGTGCTAAAGCATTCGGTAG 

CAATTATAAAAATGCTGATGTTGAAACTTATTTTGGTACCGGTAATGTACAAGATACAAA 

GGATACAACTGATGCGACCGGTACTGCAGGAACAAAAGTTTATCAAGTACAGGTGGAAGG 

GCAGACTTATTTTGTTGGTCAAGATAATAATACCAACACGAACGGTTTTACATTATTGAA 

ACAAAACTCTACAGGTTATGAAAAAGTTCAGGTGGGTGGTAAGGATGTTCAGTTAGCAAA 

CTTTGGTGGTCGTGTAACTGCATTTGTTGAAGATAATGGTTCTGCCACATCAGTTGATTT 

AGCTGCGGGTAAAATGGGTAAAGCATTAGCTTATAATGATGCACCAATGTCTGTTTATTT 

TGGGGGAAAAAACCTAGATGTCCACCAAGTACAAGATACCCAAGGGAATCCTGTACCTAA 

TTCATTTGCTGCTAAAACATCAGACGGCACCTACATTGCAGTAAATGTAGATGCCGCTAC 

AGGTAACACGTCTGTTATTACTGATCCTAATGGTAAGGCAGTTGAATGGGCAGTAAAAAA 

TGATGGTTCTGCACAGGCAATTATGCGTGAAGATGATAAGGTTTATACAGCCAATATCAC 

GAATAAGACGGCAACCAAAGGTGCTGAACTCAGTGCCTCAGATTTGAAAGCCTTAGCAAC 

CACAAATCCATTATCCACATTAGACGAAGCTTTGGCAAAAGTTGATAAGTTGCGCAGTTC 

TTTGGGTGCAGTACAAAACCGTTTCGACTCTGCCATCACCAACCTTGGCAACACCGTAAA 

CAACCTGTCTTCTGCCCGTAGCCGTATAGAAGATGCTGACTACGCAACCGAAGTGTCTAA 

CATGTCTCGTGCGCAGATCCTGCAACAAGCGGGTACCTCTGTTCTGGCACAG 



Figure 9 
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AACAAAAACCAGTCTGCGCTGTCGACTTCTATCGAG 

CGCCTCTCTTCTGGTCTGCGTATTAACAGCGCTAAAGATGACGCCGCGGGCCAGGCGATT 
GCTAACCGCTTTACTTCTAACATCAAAGGTCTGACTCAGGCCGCACGTAACGCCAACGAC 
GGTATTTCTCTGGCGCAGACGGCTGAAGGCGCGCTGTCAGAGATTAACAACAACTTGCAG 
CGTATTCGTGAACTGACCGTTCAGGCCTCTACCGGCACGAACTCTGATTCCGACCTGTCT 
TCTATTCAGGACGAAATCAAATCCCGTCTTGATGAAATTGACCGTGTATCTGGTCAGACC 
CAGTTCAACGGTGTGAACGTGCTGTCGAAAAACGATTCGATGAAGATTCAGATTGGTGCC 
AATGATAACCAGACGATCAGCATTGGCTTGCAACAAATCGACAGTACCACTTTGAATCTG 
AAAGGATTTACCGTGTCCGGCATGGCGGATTTCAGCGCGGCGAAACTGACGGCTGCTGAT 
GGTACAGCT^TTGCTGCTGCGGATGTCAAGGATGCTGGGGGTAAACAAGTCTVATTTACTG 
TCTTACACTGACACCGCGTCTAACAGTACTAAATATGCGGTCGTTGATTCTGCAACCGGT 
AAATACATGGAAGCCACTGTAGTCATTACCGGTACGGCGGCGGCGGTAACTGTTGGTGCA 
GCGGAAGTGGCGGGAGCCGCTACAGCCGATCCGTTAAAAGCACTGGATGCCGCAATCGCT 
AAAGTCGACAAATTCCGCTCCTCCCTCGGTGCCGTTCAAAACCGTCTGGATTCTGCGGTC 
ACCAACCTGAACAACACCACCACCAACCTGTCTGAAGCGCAGTCCCGTATTCAGGACGCC 

GACTATGCGACCGAAGTGTCCAACATGTCGAAAGCGCAGATTATCCAGCAGGCGGGCAAC TCCGTGCTGTCTAA 



Figure 10 
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AACAAAAACCAGTCTGCGCTGTCGACTTCTAT 

CGAGCGCCTCTCTTCTGGTCTGCGTATTAACAGCGCTAAAGATGACGCCGCGGGCCAGGC 
GATTGCTAACCGCTTCACTTCTAACATCAAAGGTCTGACTCAGGCCGCACGTAACGCCAA 
CGACGGTATCTCTCTGGCGCAGACCACTGAAGGCGCGCTGTCTGAAATCAACAACAACTT 
GCAGCGTGTGCGTGAGTTGACCGTTCAGGCGACGACCGGGACTAACTCTGATTCTGACCT 
GTCTTCTATTCAGGACGAAATCAAATCCCGTCTGGATGAAATTGATCGCGTTTCCGGTCA 
GACCCAGTTCAACGGCGTGAATGTGCTGGCGAAAGATGGTTCGATGAAGATTCAGGTTGG 
CGCGAATGATGGGCAGACTATTAGCATTGATTTGCAGAAGATTGACTCTTCTACATTAGG 
ACTGAACGGTTTCTCCGTTTCGGGTCAGTCACTTAACGTTAGTGATTCCATTACTCAAAT 
TACCGGTGCCGCCGGGACAAAACCTGTTGGTGTTGATTTCACTGCTGTTGCGAAAGATCT 

gactactgcgacaggtaaaacagtcgatgtttctagcctgacgttacacaacactctgga 
tgcga;^ggggctgctacatcacagttcgtcgttcaatccggcaatgatttctactccgc 
gtcgattaatcatacagacggcaaagtcacgttgaataaagccgatgtcgaatacacaga 
caccgataatggactaacgactgcggctactcagaaagatcaactgattaaagttgccgc 
tgactctgacggctcggctgcgggatatgtaacattccaaggtaaaaactacgctacaac 
ggtttcaacggcacttgatgataatactgcggcaaaagcaacagataataaagttgttgt 
tgaattatcaacagcaaaaccgactgcacagttctcaggggcttcttctgctgatccact 
ggcacttttagacaaagctattgcacaggttgatactttccgctcctccctcggtgcggt 
gcaaaaccgtctggattccgcagtaaccaacctgaacaacaccaccaccaacctgtctga 
agcgcagtcccgtattcaggacgccgactatgctacagaagtgtccaacatgtcgaaagc 
gcagatcatccagcaggcaggtaactcggtgctgtccaaa 



Figure 11 
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ATGGCACAAGTCATTAATACCAACAGCCTCTCGC 

TGATCACTCAAAATAATATCAACAAGAACCAGTCTGCGCTGTCGAGTTCTATCGAGCGTC 
TGTCTTCTGGCTTGCGTATTAACAGCGCGAAGGATGACGCCGCGGGTCAGGCGATTGCTA 
ACCGTTTTACTTCTAACATTAAAGGCCTGACTCAGGCTGCACGTAACGCCAACGACGGTA 
TTTCTGTTGCGCAGACCACCGAAGGCGCGCTGTCCGAAATTAACAACAACTTACAGCGTA 
TTCGTGAACTGACGGTTCAGGCTTCTACCGGGACTAACTCTGATTCGGATCTGGACTCCA 
TTCAGGACGAAATCAAATCCCGTCTCGACGAAATTGACCGCGTATCCGGTCAGACCCAGT 
TCAACGGCGTGAACGTACTGGCAAAAGACGGTTCGATGAAAATTCAGGTTGGTGCGAATG 
ACGGCCAGACTATCACTATTGATCTGAAGAAAATTGACTCTGATACGCTGGGGCTGAATG 
GGTTTAATGTGTIACGGCAAAGGGGAAACGGCTAATACGGCAGCAACCCTGAAAGATATGT 
CTGGATTCACAGCTGCGGCGGCACCAGGGGGAACTGTTGGTGTAACTCAATATACTGACA 
AATCGGCTGTAGCAAGTAGCGTAGATATTCTAAATGCTGTTGCTGGCGCAGATGGAAATA 
AAGTTACAACTAGCGCCGATGTTGGTTTTGGTACACCAGCCGCTGCTGTAACCTATACCT 
ACAATAAAGACACTAATTCATATTCCGCCGCTTCTGATGATATTTCCAGCGCTAACCTGG 
CTGCTTTCCTCAATCCTCAGGCCGGAGATACGACTAAAGCTACAGTTACAATTGGTGGCA 
AAGATCAAGATGTAAACATCGATAAATCCGGTAATTTAACTGCTGCTGATGATGGCGCAG 
TACTTTATATGGATGCTACCGGTAACTTAACTAAAAATAATGCTGGTGGTGATACACAAG 
CTACTTTGGCTAAACTTGCTACTGCTACTGGTGCTAAAGCCGCGACCATCCAAACTGATA 
AAGGAACATTCACCAGTGACGGTACAGCGTTTGATGGTGCATCAATGTCCATTGATACCA 
ATACATTTGCAAATGCAGTAAAAAATGACACTTATACTGCCACTGTAGGTGCTAAGACTT 
ATAGCGTAACAACAGGTTCTGCTGCTGCAGACACCGCTTATATGAGCAATGGGGTTCTCA 
GTGATACTCCGCCAACTTACTATGCACAAGCTGATGGAAGTATCACAACTACTGAGGATG 
CGGCTGCCGGTAAACTGGTCTACAAAGGTTCCGATGGTAAGTTAACAACGGATACGACTA 
GCAAAGCAGAATCAACATCAGATCCGCTGGCAGCTCTTGACGACGCTATCAGCCAGATCG 
ACAAATTCCGCTCCTCCCTGGGTGCGGTGCAAAACCGTCTGGATTCCGCAGTGACCAACC 
TGAACAACACCACTACCAACCTGTCTGAAGCGCAGTCCCGTATTCAGGACGCCGACTATG 
CGACCGAAGTGTCCAACATGTCGAAAGCGCAGATTATCCAGCAGGCCGGTAACTCCGTGC 
TGGCAAAAGCTAACCAGGTTCCGCAGCAGGTTCTGTCTCTGCTGCAGGGTTAA 
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ATGGCACAAG TCATTAATAC CAACAGCCTC TCGCTGATCA CTCAAAATAA TATCAACAAG 
AACCAGTCTG CGCTGTCGAG TTCTATCGAG CGTCTGTCTT CTGGCTTGCG TATTAACAGC 
GCGAAGGATG ACGCCGCGGG TCAGGCGATT GCTAACCGTT TTACTTCTAA CATTAAAGGC 
CTGACTCAGG CTGCACGTAA CGCCAACGAC GGTATTTCTG TTGCACAGAC CACCGAAGGC 
GCGCTGTCTG AAATCAACAA CAACTTACAG CGTATCCGTG AGCTGACGGT TCAGGCTTCT 
ACCGGAACTA ACTCTGATTC GGATCTGGAC TCCATTCAGG ACGAAATCAA ATCCCGTCTT 
GATGAAATTG ACCGCGTATC CGGCCAGACC CAGTTCAACG GCGTGAACGT ACTGGCAAAA 
GACGGTTCGA TGAAAATTCA GGTTGGTGCG AATGACGGTG AAACTATCAC TATCGACCTG 
AAGAAAATCG ATTCTGATAC TCTGGGTCTG AATGGTTTTA ACGTAAATGG TAAAGGTACT 
ATTACCAACA AAGCTGCAAC GGTAAGTGAT TTAACTTCTG CTGGCGCGAA GTTAAACAC 
CACGACAGGT CTTTATGATC TGAAAACCGA AAATAGCTTG TTAACTACCG ATGCTGCATT 
CGATAAATTA GGGAATGGCG ATAAAGTCAC CGTTGGCGGC GTAGATTATA CTTACAACGC 
TAAATCTGGT GATTTTACTA CCACCAAATC TACTGCTGGT ACGGGTGTAG ACGCCGCGGG 
GCAGGCTACT GATTCAGCTA AAAAACGTGA TGCGTTAGCT GCCACCCTTC ATGCTGATGT 
GGGTAAATCT GTTAATGGTT CTTACACCAC AAAAGATGGT ACTGTTTCTT TCGAAACGGA 
TTCAGCAGGT AATATCACCA TCGGTGGAAG CCAGGCATAC GTAGACGATG CAGGCAACTT 
GACGACTAAC AACGCTGGTA GCGCAGCTAA AGCTGATATG AAAGCGCTGC TTAAAGCCGC 
GAGCGAAGGT AGTGACGGTG CTTCTCTGAC ATTCAATGGC ACTGAATATA CTATCGCAAA 
AGCAACTCCT GCGACAACCT CTCCAGTAGC TCCGTTAATC CCTGGTGGGA TTACTTATCA 
GGCTACAGTG AGTAAAGATG TAGTATTGAG CGAAACCAAA GCGGCTGCCG CGACATCTTC 
AATTACCTTT AATTCCGGTG TACTGAGCAA AACTATTGGG TTTACCGCGG GTGAATCCAG 
TGATGCTGCG AAGTCTTATG TGGATGATAA AGGTGGTATT ACTAACGTTG CCGACTATAC 
AGTCTCTTAC AGCGTTAACA AGGATAACGG CTCTGTGACT GTTGCCGGGT ATGCTTCAGC 
GACTGATACC AATAAAGATT ATGCTCCAGC AATTGGTACT GCTGTAAATG TGAACTCCGC 
GGGTAAAATC ACTACTGAGA CTACCAGTGC TGGTTCTGCA ACGACCAACC CGCTTGCTGC 
CCTGGACGAC GCTATCAGCT CCATCGACAA ATTCCGTTCT TCCCTGGGTG CTATCCAGAA 
CCGTCTGGAT TCCGCAGTCA CCAACCTGAA CAACACCACT ACCAACCTGT CTGAAGCGCA 
GTCCCGTATT CAGGACGCCG ACTATGCGAC CGAAGTGTCC AACATGTCGA AAGCGCAGAT 
TATCCAGCAG GCCGGTAACT CCGTGCTGGC AAAAGCCAAC CAGGTACCGC AGCAGGTTCT 
GTCTCTGCTG CAGGGTTAA 
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AACAAATCTCAGTCTTCTCTTAGCTCTGCTA 

TTGAGCGTCTGTCTTCTGGTCTGCGTATTAACAGCGCAAAAGACGATGCAGCAGGTCAGG 

CGATTGCTAACCGTTTTACGGCAAATATTAAAGGTCTGACCCAGGCTTCCCGTAACGC7VA 

ATGATGGTATTTCTGTTGCGCAGACCACTGAAGGTGCGCTGAATGAAATTAACAACAACC 

TGCAGCGTATTCGTGAACTTTCTGTTCAGGCAACTAACGGTACTAACTCTGACAGCGATC 

TTTCTTCTATCCAGGCTGAAATTACTCAACGTCTGGAAGAAATTGACCGTGTATCTGAGC 

AAACTCAGTTTAACGGCGTGAAAGTCCTTGCTGAAAATAATGAAATGAAAATTCAGGTTG 

GTGCTAATGATGGTGAAACCATCACTATCTU^TCTGGCAAAAATTGATGCGAAAACTCTCG 

GCCTGGACGGTTTTAATATCGATGGCGCGCAGAAAGCAACAGGCAGTGACCTGATTTCTA 

AATTTAAAGCGACAGGTACTGATAATTATGATGTTGGCGGTAAAACTTATACCGTGAATG 

TGGAGAGCGGCGCGGTTAAGAATGATGCTAATAAAGATGTTTTTGTAAGCGCAGCTGATG 

GATCGCTGACGACCAGTAGTGATACTAAAGTATCCGGTGAAAGTATTGATGCAACAGAAC 

TAGCGAAACTTGCAATAAAATTAGCTGACAAAGGCTCCATTGAATACAAGGGCATTACAT 

TTACTAACAACACTGGCGCAGAGCTTGATGCTAATGGTAAAGGTGTTTTGACCGCAAATA 

TTGATGGTCAAGATGTTCAATTTACTATTGACAGTAATGCACCCACGGGTGCCGGCGCAA 

CAATAACTACAGACACAGCTGTTTACAAAAACAGTGCGGGCCAGTTCACCACTACAAAAG 

TGGAA7VATAAAGCCGCAACACTCTCTGATCTGGATCTTAATGCAGCCAAGAAAACAGGTA 

GCACTTTAGTTGTAAATGGCGCCACCTACAATGTCAGCGCAGATGGTAAAACGGTAACTG 

ATACTACTCCTGGTGCCCCTAAAGTGATGTATCTGAGCAAATCAGAAGGTGGTAGCCCGA 

TTCTGGTAAACGAAGATGCAGCAAAATCGTTGCAATCTACCACCAACCCGCTCGAAACTA 

TCGACAAGGCATTGGCTAAAGTTGACAATCTGCGTTCTGACCTCGGTGCAGTACAAAACC 

GTTTCGACTCTGCCATCACCAACCTTGGCAACACCGTAAACAACCTGTCTTCTGCCCGTA 

GCCGTATCGAAGATGCTGACTACGCGACCGAAGTGTCTAACATGTCTCGTGCGCAGATCC 

TGCAACAAGCGGGTACCTCTGTTCTGGCGCAG 
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ATGGCACAAGTCATTAATACCAACAGCCTCTCG 

CTGATCACTCAAAATAATATCAACAAGAACCAGTCTGCGCTGTCGAGTTCTATCGAGCGT 

CTGTCTTCTGGCTTGCGTATTAACAGCGCGAAGGATGACGCCGCGGGTCAGGCGATTGCT 

AACCGTTTTACTTCTAACATTAAAGGCCTGACTCAGGCTGCACGTAACGCCAACGACGGT 

ATTTCCGTTGCACAGACCACTGAAGGCGCGCTGTCCGAAATTAACAACAACTTACAGCGT 

ATTCGTGAACTGACGGTTCAGGCTTCTACCGGGACTAACTCCGATTCGGATCTGGACTCC 

ATTCAGGACGAAATCAAATCCCGTCTGGACGAAATTGACCGCGTATCCGGCCAGACCCAG 

TTCAACGGCGTGAACGTGCTGTCCAAAGATGGCTCGATGAAAATTCAGGTCGGCGCGAAC 

GATGGCGAAACGATTACTATTGATCTGAAGAAAATTGACTCTGATACGCTGAATCTGGCT 

GGTTTTAACGTTAACGGTAAAGGTTCTGTAGCGAATACAGCTGCGACAAGCGACGATTTA 

AAACTGGCTGGTTTCACTAAGGGCACCACAGATACCAATGGCGTGACCGCGTATACAAAC 

ACAATTAGTAATGACAAAGCCAAAGCTTCCGATCTGTTAGCTAATATCACCGATGGATCA 

GTGATCACTGGGGGAGGGGCAAACGCTTTTGGCGTGGCTGCAAAGAATGGTTACACCTAT 

GATGCAGCAAGTAAATCTTATAGTTTTGCTGCAGATGGTGCCGATTCAGCGAAGACGTTA 

AGCATCATTAATCCAAACACCGGTGATTCGTCGCAGGCGACAGTGACTATTGGTGGTAAA 

GAGCAGAAAGTTAATATTTCCCAGGATGGAAAAATTACTGCGGCAGATGATAATGCGACG 

CTGTATTTAGATAAACAGGGAAACTTGACAAAAACGAAtGCAGGTAACGATACCGCAGCG 

ACTTGGGATGGTTTAATTTCCAACAGCGATTCTACCGGTGCGGTTCCAGTTGGGGTTGCA 

ACTACAATTACAATTACTTCTGGTACAGCTTCCGGAATGTCTGTTCAGTCCGCAGGAGCA 

GGAATTCAGACCTCAACAAATTCTCAGATTCTTGCAGGTGGTGCATTTGCGGCTAAGGTA 

AGTATTGAGGGAGGCGCTGCTACAGACATTTTGGTAGCAAGTAATGGAAACATAACAGCG 

GCTGATGGTAGTGCACTTTATCTTGATGCGACTACTGGTGGATTCACTACAACGGCTGGA 

GGAAATACAGCTGCTTCGTTAGATAATTTAATTGCTAACAGTAAGGATGCTACCTTAACC 

GTAACTTCAGGTACCGGCCAGAACACTGTTTATAGCACAACAGGAAGTGGCGCTCAGTTC 

ACCAGTTTAGCAAAAGTAGACACAGTCAATGTCACCAACGCACATGTCAGTGCCGAAGGT 

ATGGCAAATCTGACAAAAAGCAATTTTACCATTGATATGGGCGGTACAGGTACAGTAACT 

TACACAGTTTCCAATGGGGATGTGAAAGCTGCTGCAAATGCTGATGTTTATGTCGAAGAT 

GGTGCACTTTCAGCCAATGCTACAAAAGATGTAACCTACTTTGAACAAAAAAATGGGGCT 

ATTACCAACAGCACCGGTGGTACCATCTATGAAACAGCTGATGGTAAGTTAACAACAGAA 

GCTACTACTGCATCCAGTTCCACCGCCGATCCCCTGAAAGCTCTGGACGAAGCCATCAGC 

TCCATCGACAAATTCCGCTCCTCCCTCGGTGCGGTGCAAAACCGTCTGGATTCCGCGGTC 

ACCAACCTGAACAACACCACTACCAACCTGTCCGAAGCGCAGTCCCGTATTCAGGACGCC 

GACTATGCGACCGAAGTGTCCAACATGTCGAAAGCGCAGATCATCCAGCAGGCCGGTAAC 

TCCGTGCTGGCAAAAGCTAACCAGGTACCGCAGCAGGTTCTGTCTCTGCTGCAGGGTTAA 



Figure 15 







ATGGCACAAGTCATTAATACCAACAGC 

CTCTCGCTGATCACTCAAAATAATATCAACAAGAACCAGTCTGCGCTGTCGAGTTCTATC 

GAGCGTCTGTCTTCTGGCTTGCGTATTAACAGCGCGAAGGATGACGCCGCGGGTCAGGCG 

ATTGCTAACCGTTTTACTTCTAACATTAAAGGCCTGACTCAGGCTGCACGTAACGCCAAC 

GACGGTATTTCTGTTGCGCAGACCACCGAAGGCGCGCTGTCCGAAATTAACAACAACTTA 

CAGCGTGTGCGTGAGCTGACTGTTCAGGCGACCACCGGTACTAACTCTGAGTCTGACCTG 

TCTTCTATCCAGGACGAAATCAAATCTCGCCTGGAAGAGATTGATCGTGTTTCAAGTCAG 

ACTCAATTTAACGGCGTGAATGTTTTGGCTAAAGATGGGAAAATGAACATTCAGGTTGGG 

GCAAATGATGGACAGACTATCACTATTGATCTGAAAAAGATCGATTCATCTACACT7VAAC 

CTCTCCAGTTTTGATGCTACAAACTTGGGCACCAGTGTTAAAGATGGGGCCACCATCAAT 

AAGCAAGTGGCAGTAGGTGCTGGCGACTTTAAAGATAAAGCTTCAGGATCGTTAGGTACC 

CTAAAATTAGTTGAGAAAGACGGTAAGTACTATGTAAATGACACTAAAAGTAGTAAGTAC 

TACGATGCCGAAGTAGATACTAGTAAGGGTAAAATTAACTTCAACTCTACAAATGAAAGT 

GGAACTACTCCTACTGCAGCGACGGAAGTAACTACTGTTGGCCGCGATGTAAAATTGGAT 

GCTTCTGCACTTAAAGCCAACCAATCGCTTGTCGTGTATAAAGATAAAAGCGGCAATGAT 

GCTTATATCATTCAGACCAAAGATGTAACAACTAATCAATCAACTTTCAATGCCGCTAAT 

ATCAGTGATGCTGGTGTTTTATCTATTGGTGCATCTACAACCGCGCCAAGCAATTTAACA 

GCTAACCCGCTTAAGGCTCTTGATGATGCAATTGCATCTGTTGATAAATTCCGCTCTTCT 

CTCGGTGCCGTTCAGAACCGTCTGGATTCTGCCATTGCCAACCTGAACAACACCACTACC 

AACCTGTCTGAAGCGCAGTCCCGTATTCAGGACGCTGACTATGCGACCGAAGTGTCCAAC 

ATGTCGAAAGCGCAGATTATCCAGCAGGCCGGTAACTCCGTGCTGGCAAAAGCCAACCAG 

GTACCGCAGCAGGTTCTGTCTCTGCTGCAGGGTTAA 
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AACAAATCTCAGTCTTCTCTGAGCTCCGCCAT 

TGAACGTCTCTCTTCTGGCCTGCGTATTAACAGTGCTAAAGATGACGCAGCAGGTCAGGC 

GATTGCTAACCGTTTTACAGCAAATATTAAAGGTCTGACTCAGGCTTCCCGTAACGCGAA 

TGATGGTATTTCTGTTGCGCAGACCACTGAAGGTGCGCTGAATGAAATTAACAACAACCT 

GCAGCGTGTACGTGAACTGACTGTTCAGGCAACTAACGGTACTAACTCTGACAGCGATCT 

TTCTTCTATCCAGGCTGAAATTACTCAACGTCTGGAAGAAATTGACCGTGTATCTGAGCA 

AACTCAGTTTAACGGCGTGAAAGTCCTTGCTGAAAATAATGAAATGAAAATTCAGGTTGG 

TGCTAATGATGGTGAAACCATCACTATCAATCTGGCAA7VAATTGATGCGAAAACTCTCGG 

CCTGGACGGTTTTAATATCGATGGCGCGCAGAAAGCAACTGGCAGTGACCTGATTTCTAA 

ATTTAAAGCGACAGGTACTGATAACTATGATGTTGGCGGTGATGCTTATACTGTTAACGT 

AGATAGCGGAGCTGTTAAAGATACTACAGGGAATGATATTTTTGTTAGTGCAGCAGATGG 

TTCACTGACAACTAAATCTGACACAAACATAGCTGGTACAGGGATTGATGCTACAGCACT 

CGCAGCAGCGGCTAAGAATAAAGCACAGAATGATAAATTCACGTTTAATGGAGTTGAATT 

CACAACAACAACTGCAGCGGATGGCAATGGGAATGGTGTATATTCTGCAGAAATTGATGG 

TAAGTCAGTGACATTTACTGTGACAGATGCTGACAAAAAAGCTTCTTTGATTACGAGTGA 

GACAGTTTACAAAAATAGCGCTGGCCTTTATACGACAACCAAAGTTGATAACAAGGCTGC 

CACACTTTCCGATCTTGATCTCAATGCAGCTAAGAAAACAGGAAGCACGTTAGTTGTTAA 

CGGTGCAACTTACGATGTTAGTGCAGATGGTAAAACGATAACGGAGACTGCTTCTGGTAA 

CAATAAAGTCATGTATCTGAGCAAATCAGAAGGTGGTAGCCCGATTCTGGTAAACGAAGA 

TGCAGCAAAATCGTTGCAATCTACCACCAACCCGCTCGAAACTATCGACAAAGCATTGGC 

TAAAGTTGACAATCTGCGTTCTGACCTCGGTGCAGTACAAAACCGTTTCGACTCTGCTAT 

CACCAACCTTGGCAACACCGTAAACAACCTGTCTTCTGCCCGTAGCCGTATCGAAGATGC 

TGACTACGCGACCGAAGTGTCTAACATGTCTCGTGCGCAGATCCTGCAACAAGCGGGTAC 

CTCTGTTCTGGCGCAG 
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ATGGCACAAGTCATTAATACCAACAGCCTCTCGCTGATCA 

CTCAAAATAATATCAACAAGAACCAGTCTGCGCTGTCGAGTTCTATCGAGCGTCTGTCTT 

CTGGCTTGCGTATTAACAGCGCGAAGGATGACGCAGCGGGTCAGGCGATTGCTAACCGTT 

TCACCTCTAACATTAAAGGCCTGACTCAGGCGGCCCGTAACGCCAACGACGGTATCTCCG 

TTGCGCAGACCACCGAAGGCGCGCTGTCCGAAATCAACAACAACTTACAGCGTATCCGTG 

AACTGACGGTTCAGGCTTCTACCGGGACTAACTCCGATTCGGATCTGGACTCCATTCAGG 

ACGAAATCAAATCCCGTCTGGACGAAATTGACCGCGTATCTGGCCAGACCCAGTTCAACG 

GCGTGAACGTACTGGCGAAAGACGGTTCAATGAAAATTCAGGTTGGTGCGAATGACGGCC 

AGACTATCACGATTGATCTGAAGAAAATTGACTCAGATACGCTGGGGCTGAATGGTTTTA 

ACGTGAATGGTTCCGGTACGATAGCCAATAAAGCGGCGACCATTAGCGACCTGACAGCAG 

CGAAAATGGATGCTGCAACTAATACTATAACTACAACAAATAATGCGCTGACTGCATCAA 

AGGCGCTTGATCAACTGAAAGATGGTGACACTGTTACTATCAAAGCAGATGCTGCTCAAA 

CTGCCACGGTTTATACATACAATGCATCAGCTGGTAACTTCTCATTCAGTAATGTATCGA 

ATAATACTTCAGCAAAAGCAGGTGATGTAGCAGCTAGCCTTCTCCCGCCGGCTGGGCAAA 

CTGCTAGTGGTGTTTATAAAGCAGCAAGCGGTGAAGTGAACTTTGATGTTGATGCGAATG 

GTAAAATCACAATCGGAGGACAGAAAGCATATTTAACTAGTGATGGTAACTTAACTACAA 

ACGATGCTGGTGGTGCGACTGCGGCTACGCTTGATGGTTTATTCAAGAAAGCTGGTGATG 

GTCAATCAATCGGGTTTAAGAAGACTGCATCAGTCACGATGGGGGGAACAACTTATAACT 

TTAAAACGGGTGCTGATGCTGATGCTGCAACTGCTAACGCAGGGGTATCGTTCACTGATA

CAGCTAGCAAAGAAACCGTTTTAAATAAAGTGGCTACAGCTAAACAAGGCAAAGCAGTTG 

CAGCTGACGGTGATACATCCGCAACAATTACCTATAAATCTGGCGTTCAGACGTATCAGG 

CTGTATTTGCCGCAGGTGACGGTACTGCTAGCGCAAAATATGCCGATAAAGCTGACGTTT 

CTAATGCAACAGCAACATACACTGATGCTGATGGTGAAATGACTACAATTGGTTCATACA 

CCACGAAGTATTCAATCGATGCTAACAACGGCAAGGTAACTGTTGATTCTGGAACTGGTA 

CGGGTAAATATGCGCCGAAAGTAGGGGCTGAAGTATATGTTAGTGCTAATGGTACTTTAA 

CAACAGATGCAACTAGCGAAGGCACAGTAACAAAAGATCCACTGAAAGCTCTGGATGAAG 

CTATCAGCTCCATCGACAAATTCCGTTCTTCCCTGGGTGCTATCCAGAACCGTCTGGATT 

CCGCAGTCACCAACCTGAACAACACCACTACCAACCTGTCCGAAGCGCAGTCCCGTATTC 

AGGACGCCGACTATGCGACCGAAGTGTCCAACATGTCGAAAGCGCAGATCATTCAGCAGG 

CCGGTAACTCCGTGCTGGCAAAAGCCAACCAGGTACCGCAGCAGGTTCTGTCTCTGCTGC

AGGGTTAA
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ATGGCACAAGTCATTAATACCAACAGCCTCTCGCTGATCACTCAAAATA 

ATATCAACAAGAACCAGTCTGCGCTGTCGAGTTCTATCGAGCGTCTGTCTTCTGGCTTGC 

GTATTAACAGCGCGAAGGATGACGCCGCAGGTCAGGCGATTGCTAACCGTTTTACTTCTA 

ACATTAAAGGCCTGACTCAGGCTGCACGTAACGCCAACGACGGTATTTCCGTTGCGCAGA 

CCACTGAAGGTGCGCTGTCCGAAATCAACAACAACTTACAGCGTATTCGTGAGCTGACGG 

TTCAGGCTTCTACCGGGACTAACTCCGATTCTGACCTGGACTCCATCCAGGACGAAATCA 

AGTCTCGTCTGGACGAAATTGACCGCGTATCCGGTCAGACCCAGTTCAACGGCGTGAACG 

TGCTGGCGAAAGACGGTTCGATGAAAATTCAGGTTGGTGCGAATGACGGCCAGACTATCA 

CGATTGATCTGAAGAAAATTGACTCAGATACGCTGGGGCTGAGTGGGTTTAATGTGAATG 

GTGGCGGGGCTGTTGCTAACACTGCTGCATCTAAAGCTGACTTGGTAGCTGCTAATGCAA 

CTGTGGTAGGCAACAAATATACTGTGAGTGCGGGTTACGATGCTGCTAAAGCGTCTGATT 

TGCTGGCTGGAGTTAGTGATGGTGATACTGTTCAGGCAACCATTAATAACGGCTTCGGAA 

CGGCGGCTAGTGCAACGAATTACAAGTATGACAGTGCAAGTAAGTCTTACTCTTTTGATA 

CCACAACGGCTTCAGCTGCCGATGTTCAGAAATATTTGACCCCGGGCGTTGGTGATACCG 

CTAAGGGCACTATTACTATCGATGGTTCTGCACAGGATGTTCAGATCAGCAGTGATGGTA 

AAATTACGTCAAGCAATGGAGATAAACTTTACATTGATACAACTGGGCGCTTAACGAAAA 

ACGGCTTTAGTGCTTCTTTGACTGAGGCTAGTCTGTCCACACTTGCAGCCAATAATACCA 

AAGCGACAACCATTGACATTGGCGGTACCTCTATCTCCTTTACCGGTAATAGTACTACGC 

CGAACACTATTACTTATTCAGTAACAGGTGCAAAAGTTGATCAGGCAGCTTTCGATAAAG 

CTGTATCAACCTCTGGAAACGATGTTGATTTCACTACCGCAGGTTATAGCGTCGACGGCG 

CAACTGGCGCTGTAACAAAAGGTGTTGCTCCGGTTTATATTGATAACAACGGGGCGTTGA 

CCACATCTGATACTGTAGATTTTTATCTACAGGATGATGGTTCAGTGACTAACGGCAGCG 

GTAAGGCAGTTTATAAAGATGCTGACGGTAAATTGACGACAGATGCTGAAACTAAAGCTG 

CAACCACCGCCGATCCCCTGAAAGCTCTGGACGAAGCCATCAGCTCCATCGACAAATTCC

GCTCCTCCCTCGGTGCGGTGCAGAACCGTCTGGATTCCGCGGTCACCAACCTGAACAACA 

CCACTACCAACCTGTCTGAAGCGCAGTCCCGTATTCAGGACGCTGACTATGCGACCGAAG 

TATCCAACATGTCGAAAGCGCAGATCATCCAGCAGGCCGGTAACTCCGTGCTGGCAAAAG 

CTAACCAGGTACCACAGCAGGTTCTGTCTCTGCTGCAGGGTTAA 
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ATGGCACAAGTCATTAATACCAACAGC

CTCTCGCTGATCACTCAAAATAATATCAACAAGAACCAGTCTGCGCTGTCGAGTTCTATC 

GAGCGTCTGTCTTCTGGCTTGCGTATTAACAGCGCGAAGGATGACGCCGCAGGTCAGGCG 

ATTGCTAACCGTTTTACTTCTAACATTAAAGGCCTGACTCAGGCTGCACGTAACGCCAAC 

GACGGTATTTCTGTTGCACAGACCACTGAAGGCGCGCTGTCCGAAATCAACAACAACTTA 

CAGCGTGTGCGTGAACTGACCGTTCAGGCAACCACCGGTACCAACTCCCAGTCTGACCTG 

GACTCTATCCAGGACGAAATTAAATCCCGTCTGGACGAAATTGATCGCGTATCCGGTCAG 

ACCCAGTTCAACGGCGTGAACGTGCTGGCAAAAGACGGTTCCATGAAAATTCAGGTTGGC 

GCGAACGATGGCCAGACCATCACTATCGACCTGAAGAAGATTGACTCTTCTACCTTGAAC 

CTGACAGGTTTTAACGTTAACGGTTCTGGTTCTGTGGCGAATACTGCAGCAACTAAAGCT 

GATTTAACCGCTGCTCAACTCTCTGCACCGGGTGCAGCAGACGCAAATGGTACAGTTACT 

TATACTGTCAGTGCTGGTTATAAAGAATCCACTGCTGCAGATGTTATTGCTAGCATCAAA 

GACGGCAGTGCTCCGACTTCTGCAATTACTGCAACCATTAATAATGGCTTCGGTGATTCC 

AGTGCGCTGACTTCCAATGACTATACTTATGACCCAGCAAAAGGCGACTTCACTTACGAC 

GTAGCTTCAAGCGCCAATAATACTGCTGCCCAGGTTCAGTCCTTCCTGACGCCGAAAGCA 

GGTGATACCGCAAATCTGAAAGTAACCGTTGGTACGACATCGGTTGATGTCGTTCTGGCC 

AGTGATGGTAAGATTACAGCAAAAGATGGTTCTGCATTATATATCGACAGTACAGGTAAC 

CTGACTCAGAACAGTGCTGGCTTGACCTCTGCTAAACTGGCTACTCTGACTGGCCTTCAG 

GGCTCTGGTGTTGCTTCAACCATCACTACTGAAGATGGCACTAATATTGATATTGCTGCT 

AACGGTAATATTGGTCTGACCGGTGTTCGTATCAGTGCTGATTCTCTGCAGTCAGCGACT 

AAATCTACGGGCTTTACTGTTGGTACTGGCGCTACAGGTCTGACCGTAGGTACTGATGGT 

AAAGTGACTATCGGCGGGACTACTGCTCAGTCCTACACCAGCAAAGATGGTTCCCTGACT 

ACTGATAACACCACTAAACTGTATCTGCAGAAAGATGGCTCTGTAACCAACGGTTCAGGT 

AAAGCGGTCTATGTAGAAGCGGATGGTGATTTCACTACCGACGCTGCAACCAAAGCCGCA 

ACCACCACCGATCCGCTGAAAGCCCTGGATGAGGCAATCAGCCAGATCGATAAGTTCCGT

TCATCCCTGGGTGCTATCCAGAACCGTCTGGATTCCGCGGTCACCAACCTGAACAACACC 

ACTACCAACCTGTCTGAAGCGCAGTCCCGTATTCAGGACGCCGACTATGCGACCGAAGTG 

TCCAACATGTCGAAAGCGCAGATCATTCAGCAGGCCGGTAACTCCGTGCTGGCAAAAGCC 

AACCAGGTACCGCAACAGGTTCTGTCTCTGCTGCAGGGCTAA 
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ATGGCACAAGTCATTAATACCAACAGCCTCTCGCTGATCAC 

TCAAAATAATATCAACAAGAACCAGTCTGCGCTGTCGAGTTCTATCGAGCGTCTGTCTTC 

TGGCTTGCGTATTAACAGCGCGAAGGATGACGCCGCAGGTCAGGCGATTGCTAACCGTTT 

TACTTCTAACATTAAAGGCCTGACTCAGGCTGCACGTAACGCCAACGACGGTATTTCTGT 

TGCACAGACCACTGAAGGCGCGCTGTCCGAAATCAACAACAACTTACAGCGTGTGCGTGA 

ACTGACCGTTCAGGCAACCACCGGTACCAACTCCCAGTCTGACCTGGACTCTATCCAGGA 

CGAAATTAAATCCCGTCTGGACGAAATTGATCGCGTATCCGGTCAGACCCAGTTCAACGG 

CGTGAACGTGCTGGCAAAAGACGGTTCCATGAAAATTCAGGTTGGCGCGAACGATGGCCA 

GACCATCACTATCGACCTGAAGAAGATTGACTCTTCTACCTTGAACCTGACAGGTTTTAA 

CGTTAACGGTTCTGGTTCTGTGGCGAATACTGCAGCAACTAAAGCTGATTTAACCGCTGC 

TCAACTCTCTGCACCGGGTGCAGCAGACGCAAATGGTACAGTTACTTATACTGTCAGTGC 

TGGTTATAAAGAATCCACTGCTGCAGATGTTATTGCTAGCATCAAAGACGGCAGTGCTCC 

GACTTCTGCAATTACTGCAACCATTAATAATGGCTTCGGTGATTCCAGTGCGCTGACTTC 

CAATGACTATACTTATGACCCAGCAAAAGGCGACTTCACTTACGACGTAGCTTCAAGCGC 

CAATAATACTGCTGCCCAGGTTCAGTCCTTCCTGACGCCGAAAGCAGGTGATACCGCAAA 

TCTGAAAGTAACCGTTGGTACGACATCGGTTGATGTCGTTCTGGCCAGTGATGGTAAGAT 

TACAGCAAAAGATGGTTCTGCATTATATATCGACAGTACAGGTAACCTGACTCAGAACAG 

TGCTGGCTTGACCTCTGCTAAACTGGCTACTCTGACTGGCCTTCAGGGCTCTGGTGTTGC 

TTCAACCATCACTACTGAAGATGGCACTAATATTGATATTGCTGCTAACGGTAATATTGG 

TCTGACCGGTGTTCGTATCAGTGCTGATTCTCTGCAGTCAGCGACTAAATCTACGGGCTT 

TACTGTTGGTACTGGCGCTACAGGTCTGACCGTAGGTACTGATGGTAAAGTGACTATCGG 

CGGGACTACTGCTCAGTCCTACACCAGCAAAGATGGTTCCCTGACTACTGATAACACCAC 

TAAACTGTATCTGCAGAAAGATGGCTCTGTAACCAACGGTTCAGGTAAAGCGGTCTATGT 

AGAAGCGGATGGTGATTTCACTACCGACGCTGCAACCAAAGCCGCAACCACCACCGATCC 

GCTGAAAGCCCTGGATGAGGCAATCAGCCAGATCGATAAGTTCCGTTCATCCCTGGGTGC 

TATCCAGAACCGTCTGGATTCCGCGGTCACCAACCTGAACAACACCACTACCAACCTGTC 

TGAAGCGCAGTCCCGTATTCAGGACGCCGACTATGCGACCGAAGTGTCCAACATGTCGAA 

AGCGCAGATCATTCAGCAGGCCGGTAACTCCGTGCTGGCAAAAGCCAACCAGGTACCGCA 

ACAGGTTCTGTCTCTGCTGCAGGGCTAA 



Figure 21 




GCGCTGTCGACTTCTATCGAGCGCCTCTCTTCTGGTCTGCGTATTAACAGCGCTAAA 

GATGACGCTGCGGGCCAGGCGATTGCTAACCGCTTCACTTCTAACATCAAAGGTCTGACT 

CAGGCCGCACGTAACGCCAACGACGGTATTTCTCTGGCGCAGACGGCTGAAGGCGCGCTG 

TCAGAGATTAACAACAACTTGCAGCGTATTCGTGAACTGACCGTTCAGGCCTCTACCGGC 

ACGAACTCTGATTCCGACCTGTCTTCTATTCAGGACGAAATCAAATCCCGTCTTGATGAA 

ATTGACCGTGTATCTGGTCAGACCCAGTTCAACGGTGTGAACGTGCTGTCGAAAAACGAT 

TCGATGAAGATTCAGATTGGTGCCAATGATAACCAGACGATCAGCATTGGCTTGCAACAA 

ATCGACAGTACCACTTTGAATCTGAAAGGATTTACCGTGTCCGGCATGGCGGATTTCAGC 

GCGGCGAAACTGACGGCTGCTGATGGTACAGCAATTGCTGCTGCGGATGTCAAGGATGCT 

GGGGGTAAACAAGTCAATTTACTGTCTTACACTGACACCGCGTCTAACAGTACTAAATAT 

GCGGTCGTTGATTCTGCAACCGGTAAATACATGGCAGCCACTGTAGTCATTACCAGTACG 

GCGGCGGCGGTAACTGTTGGTGCAACGGAAGTGGCGGGAGCCGCTACAGCCGAACCGTTA 

AAAGCACTGGATGCCGCAATCGCTAAAGTCGACAAATTCCGCTCCTCCCTCGGTGCCGTT 

CAAAACCGTCTGGATTCTGCGGTCACCAACCTGAACAACACCACCACCAACCTGTCTGAA 

GCGCAGTCCCGTATTCAGGACGCCGACTATGCGACCGAAGTGTCCAACATGTCGAAAGCG 

CAGATTATCCAGCAGGCG 
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ATGGCACAAGTCATTAATACCAACAGCCTCTCGCTGATCACTCAAAATA 

ATATCAACAAGAACCAGTCTGCGCTGTCGAGTTCTATCGAGCGTCTGTCTTCTGGCTTGC 

GTATTAACAGCGCGAAGGATGACGCCGCAGGTCAGGCGATTGCTAACCGTTTTACTTCTA 

ATATTAAAGGCCTGACTCAGGCTGCACGTAACGCCAATGACGGTATTTCTGTTGCACAGA 

CCACTGAAGGCGCGCTGTCCGAAATCAACAACAACTTACAGCGTATTCGTGAACTGACGG 

TTCAGGCCACTACAGGGACTAACTCCGATTCTGACCTGGACTCCATCCAGGACGAAATCA 

AATCTCGTCTGGACGAAATTGACCGCGTATCCGGTCAGACCCAGTTCAACGGCGTGAACG 

TGCTGTCCAAAGATGGTTCAATGAAAATTCAGGTCGGCGCAAATGATGGTGAAACCATCA 

CGATTGATCTGAAGAAAATTGACTCTGATACGCTGAATCTGGCTGGTTTTAACGTGAATG 

GCGAAGGTGAAACAGCCAATACTGCTGCAACACTTAAAGATATGGTTGGTTTAAAACTCG 

ATAATACGGGGGTCACTACAGCTGGAGTTAATAGATATATTGCTGACAAAGCCGTCGCAA 

GTAGCACGGATATTTTGAATGCGGTAGCTGGTGTTGATGGCAGTAAAGTTTCCACGGAGG 

CAGATGTTGGTTTTGGTGCAGCTGCCCCTGGTACGCCAGTGGAATATACTTATCATAAAG 

ATACTAACACATATACGGCTTCTGCTTCAGTTGATGCGACTCAACTGGCGGCATTCCTGA 

ATCCTGAAGCGGGTGGTACCACTGCTGCAACAGTAAGTATTGGCAACGGTACAACAGCTC 

AAGAGCAAAAAGTCATTATTGCTAAAGATGGTTCTTTAACTGCTGCTGATGACGGTGCCG 

CTCTCTATCTTGATGATACTGGTAACTTAAGTAAAACTAACGCAGGCACTGATACTCAAG

CTAAACTGTCTGACTTAATGGCTAACAATGCTAATGCCAAAACAGTCATTACAACAGATA

AAGGTACATTTACTGCTAATACGACAAAGTTTGATGGGGTAGATATTTCTGTTGATGCTT 

CAACGTTTGCTAACGCCGTTAAAAATGAGACTTACACTGCAACTGTTGGTGTAACTTTAC 

CTGCGACATATACAGTCAATAATGGCACTGCTGCATCAGCGTATTTAGTCGATGGAAAAG 

TGAGCAAAACTCCTGCCGAGTATTTTGCTCAAGCTGATGGCACTATTACTAGTGGTGAAA 

ATGCGGCTACCAGTAAAGCTATCTATGTAAGTGCCAATGGTAACTTAACGACTAATACAA 

CTAGTGAATCTGAAGCTACTACCAACCCGCTGGCAGCATTGGATGACGCTATCGCGTCTA 

TCGACAAATTCCGTTCTTCCCTGGGTGCTATCCAGAACCGTCTGGATTCCGCAGTCACCA 

ACCTGAACAACACCACTACCAACCTGTCTGAAGCGCAGTCCCGTATTCAGGACGCCGACT 

ATGCGACCGAAGTGTCCAACATGTCGAAAGCGCAGATCATTCAGCAGGCCGGTAACTCCG 

TGCTGGCAAAAGCCAACCAGGTACCGCAGCAGGTTCTGTCTCTGCTGCAGGGTTAA 
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ATGGCACAAGTCATTAATACCAACAGCCTCTCGCTGATCACTCAAAATAATAT 

CAACAAGAACCAGTCTGCGCTGTCGAGTTCTATCGAGCGTCTGTCTTCTGGCTTGCGTAT 

TAACAGCGCGAAGGATGACGCCGCAGGTCAGGCGATTGCTAACCGTTTTACTTCTAACAT 

TAAAGGCCTGACTCAGGCTGCACGTAACGCCAACGACGGTATTTCTGTTGCGCAGACCAC 

TGAAGGCGCGCTGTCCGAAATTAACAACAACTTACAGCGTATTCGTGAACTGACGGTTCA 

GGCGACGACCGGAACTAACTCCACCTCTGACCTGGACTCCATCCAGGACGAAATCAAATC 

CCGTCTTGACGAAATTGACCGCGTATCTGGTCAGACCCAGTTCAACGGCGTGAACGTGCT 

GTCTAAAGATGGCTCGATGAAAATTCAGGTCGGCGCGAACGATGGCGAAACGATTACTAT 

TGATCTGAAGAAAATTGACTCTGATACGCTGAATCTGGCTGGTTTTAACGTTAACGGTAA 

AGGTTCTGTAGCGAATACCGCTGCGACTACAGATAATCTGACATTGGCTGGTTTTACAGC 

GGGTACTAAAGCTGCTGATGGCACCGTAACTTATAGCAAAAATGTCCAGTTTGCCGCCGC 

GACTGCAAGCAATGTACTGGCTGCTGCTAAAGATGGCGACGAAATTACGTTCGCTGGTAA 

TAACGGCACAGGTATAGCTGCAACTGGGGGGACTTATACTTATCATAAGGACTCTAACTC 

ATACAGCTTTAGCGCAACGGCTGCATCTAAAGATTCTCTGTTGAGCACACTGGCACCAAA 

CGCTGGCGATACATTTACCGCTAAAGTGACTATTGGTTCTAAATCGCAAGAAGTTAACGT

TAGCAAAGATGGTACGATTACATCCAGCGATGGTAAGGCGCTGTATTTAGATGAGAAGGG 

CAACCTGACCCAAACAGGTAGTGGCACAACCAAAGCTGCAACCTGGGATAACCTGATGGC 

CAATACAGATACTACAGGCAAAGATGCCTATGGTAACTCTGCGGCAGCAGCTGTTGGGAC 

AGTAATCGAAGCAAAAGGAATGACCATCACTTCTGCTGGTGGTAATGCTCAGGTGTTAAA 

AGACGCGGCTTATAATGCCGCATATGCGACCTCAATTACTACTGGTACTCCGGGTGATGC 

GGGAGCCGCGGGAGCCGCTGCAACTGCGGGTAATGCCGCGGTGGGAGCGCTGGGCGCAAC 

GGCAGTTGATAATACCACGGCAGATGTTGCCGATATCTCTATCTCAGCTTCGCAAATGGC

GAGCATCCTTCAGGATAAAGATTTCACCTTAAGTGATGGTAGTGATACTTACAACGTGAC 

CAGCAATGCTGTCACTATCAATGGCAAAGCAGCAAACATTGATGACAGCGGCGCAATCAC 

AGACCAAACCAGTAAAGTTGTCAATTATTTCGCTCATACTAACGGTAGCGTGACTAACGA 

TACAGGCTCCACTATTTATGCGACAGAAGATGGTAGCCTGACCACCGATGCAGCAACCAA 

AGCCGAAACCACCGCCGATCCCCTGAAAGCTCTGGACGAAGCCATCAGCTCCATCGACAA 

ATTCCGCTCCTCCCTCGGTGCGGTGCAAAACCGTCTGGATTCCGCGGTCACCAACCTGAA 

CAACACCACCACCAACCTGTCTGAAGCGCAGTCCCGTATTCAGGACGCCGACTATGCGAC 

CGAAGTGTCCAACATGTCGAAAGCGCAGATTATCCAGCAGGCCGGTAACTCCGTGCTGGC 

AAAAGCTAACCAGGTACCACAGCAGGTTCTGTCTCTGCTGCAGGGTTAA 
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ATGGCACAAGTCATTAATACCAACAGCCTCTCGCTG 

ATCACTCAAAATAATATCAACAAGAACCAGTCTGCGCTGTCGAGTTCTATCGAGCGTCTG 

TCTTCTGGCTTGCGTATTAACAGCGCGAAGGATGACGCCGCAGGTCAGGCGATTGCTAAC 

CGTTTTACTTCTAACATTAAAGGCCTGACTCAGGCGGCCCGTAACGCCAACGACGGTATT 

TCTGTTGCGCAGACCACCGAAGGCGCGCTGTCCGAAATTAACAACAACTTACAGCGTGTG 

CGTGAGCTGACTGTTCAGGCGACCACCGGTACCAACTCCCAGTCTGATCTGGACTCTATC 

CAGGACGAAATCAAATCCCGTCTGGACGAAATTGACCGCGTATCCGGTCAGACCCAGTTC 

AACGGCGTGAACGTGCTGGCAAAAGACGGTTCCATGAAAATTCAGGTTGGCGCGAATGAT 

GGCCAGACCATCACTATCGACCTGAAGAAGATTGACTCTTCTACGTTGAAACTGACTGGT 

TTTAACGTGAATGGTTCTGGTTCTGTGGCGAATACTGCGGCGACTAAAGCGGATTTGGCT 

GCTGCTGCAATTGGTACCCCTGGGGCAGCAGATTCTACAGGTGCCATTGCTTACACAGTA 

AGTGCTGGGCTGACTAAAACTACAGCCGCAGATGTACTGTCTAGCCTCGCTGATGGTACG 

ACTATTACAGCCACAGGCGTGAAAAATGGCTTTGCTGCAGGAGCCACTTCCAATGCCTAT 

AAACTTAACAAAGATAATAATACATTTACTTATGACACGACTGCTACGACAGCTGAGCTG 

CAGTCTTACCTGACTCCGAAAGCGGGCGACACTGCAACATTCAGTGTTGAAATTGGTGGT 

ACTACACAAGACGTCGTGCTGTCCAGTGATGGCAAACTCACTGCTAAGGATGGCTCTAAG 

CTTTACATTGATACAACTGGTAATTTAACTCAGAATGGTGGTAATAACGGTGTTGGAACA 

CTCGCGGAAGCGACTCTGAGTGGTTTAGCTCTGAACAAAAATGGTTTAACGGCTGTTAAA 

TCCACAATTACTACAGCTGATAACACTTCGATTGTACTGAATGGTTCAAGCGATGGTACT 

GGTAATGCTGGTACTGAAGGTACGATTGCTGTTACAGGCGCTGTAATTAGTTCAGCTGCT 

CTGCAATCTGCAAGCAAAACGACTGGTTTCACTGTTGGTACAGTAGACACAGCTGGTTAT 

ATCTCTGTAGGTACTGATGGGAGTGTTCAGGCATATGATGCTGCGACTTCTGGCAACAAA 

GCTTCTTACACCAACACTGACGGTACACTGACTACTGATAACACCACTAAACTGTATCTG 

CAGAAAGATGGCTCTGTAACCAACGGTTCAGGTAAAGCGGTCTATGTAGAAGCGGATGGT 

GATTTCACTACCGACGCTGCAACCAAAGCCGCAACCACCACCGATCCGCTGGCCGCTCTG 

GATGACGCAATCAGCCAGATCGACAAGTTCCGTTCATCCTTGGGTGCTATCCAGAACCGT 

CTGGATTCTGCAGTCACCAACCTGAACAACACCACCACCAACCTGTCTGAAGCGCAGTCC 

CGTATTCAGGACGCCGACTATGCGACCGAAGTGTCCAATATGTCGAAAGCGCAGATCATC 

CAGCAGGCCGGTAACTCCGTGCTGGCAAAAGCCAACCAGGTACCGCAGCAGGTTCTGTCT 

CTGCTGCAGGGTTAA 
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AACAAATCTCAGTCTTCTCTGAGCTCCGCCATTGAA 

CGTCTCTCTTCTGGCCTGCGTATTAACAGTGCTAAAGATGACGCAGCAGGTCAGGCGATT 

GCTAACCGTTTTACAGCAAATATTAAAGGTCTGACTCAGGCTTCCCGTAACGCGAATGAT 

GGTATTTCTGTTGCGCAGACCACTGAAGGTGCGCTGAATGAAATTAACAACAACCTGCAG

CGTATTCGTGAACTTTCTGTTCAGGCAACTAACGGTACTAACTCTGACAGCGATCTTTCT 

TCTATCCAGGCTGAAATTACTCAACGTCTGGAAGAAATTGACCGTGTATCTGAGCAAACT 

CAGTTTAACGGCGTGAAAGTCCTTGCTGAAAATAATGAAATGAAAATTCAGGTTGGTGCT 

AATGATGGTGAAACCATCACTATCAATCTGGCAAAAATTGATGCGAAAACTCTCGGCCTG 

GACGGTTTTAATATCGATGGCGCGCAGAAAGCAACCGGCAGTGACCTGATTTCTAAATTT 

AAAGCGACAGGTACTGATAATTATCAAATTAACGGTACTGATAACTATACTGTTAATGTA 

GATAGTGGCGTAGTACAGGATAAAGATGGCAAACAAGTTTATGTGAGTACTGCGGATGGT 

TCACTTACGACCAGCAGTGATACTCAATTCAAGATTGATGCAACTAAGCTTGCAGTGGCT 

GCTAAAGATTTAGCTCAAGGGAATAAGATTGTCTACGAAGGTATCGAATTTACAAATACC 

GGCACTGTCGCTATAGATGCCAAAGGTAATGGTAAATTAACCGCCAATGTTGATGGTAAG 

GCTGTTGAATTCACTATTTCGGGGAGTACTGATACATCAGGTACTAGTGCAACCGTTGCC 

CCTACGACAGCCCTATACAAAAATAGTGCAGGGCAATTGACTGCAACAAAAGTTGAAAAT 

AAAGCAGCGACACTATCTGATCTTGATCTGAACGCTGCCAAGAAAACAGGAAGCACGTTA 

GTTGTTAACGGTGCAACTTACGATGTTAGTGCAGATGGTAAAACGATAACGGAGACTGCT 

TCTGGTAACAATAAAGTCATGTATCTGAGCAAATCAGAAGGTGGTAGCCCGATTCTGGTA 

AACGAAGATGCAGCAAAATCGTTGCAATCTACCACCAACCCGCTCGAAACTATCGACAAA 

GCATTGGCTAAAGTTGACAATCTGCGTTCTGACCTCGGTGCAGTACAAAACCGTTTCGAC 

TCTGCCATCACCAACCTTGGCAACACCGTAAACAACCTGTCTTCTGCCCGTAGCCGTATC 

GAAGATGCTGACTACGCGACCGAAGTGTCTAACATGTCTCGTGCGCAGATCCTGCAACAA

GCGGGTACCTCTGTTCTGGCACAG 
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ATGGCACAAGTCATTAATACCAACAGCCTCTCGCTGATCACTCAAAATA 

ATATCAACAAGAACCAGTCTGCGCTGTCGAGTTCTATCGAGCGTCTGTCTTCTGGCTTGC 

GTATTAACAGCGCGAAGGATGACGCAGCGGGTCAGGCGATTGCTAACCGTTTTACTTCTA 

ACATTAAAGGCCTGACTCAGGCGGCACGTAACGCCAACGACGGTATCTCTCTGGCGCAGA 

CCACCGAAGGTGCGCTGTCTGAAATCAACAACAACTTACAGCGTGTACGTGAACTGACCG 

TTCAGGCAACCACCGGTACTAACTCCGACTCCGACCTGGCTTCTATTCAGGACGAAATCA 

AATCCCGTCTGGATGAAATTGACCGCGTATCTGGTCAGACTCAGTTCAACGGCGTGAACG 

TGCTGGCAAAAGACGGTTCCATGAAAATTCAGGTAGGTGCTAACGACGGCCAGACTATCA 

CTATTGACCTGAAAAAAATCGACTCTGATACTCTGGGCCTGAATGGTTTTAACGTGAATG 

GTTCTGGGACGATTACCAACAAAGCAGCAACTGTCAGTGATGTTACTCGCGCAGGCGGTA 

CATTGGTGAATGGTGCCTATGATATAAAAACCACTAACACAGCGCTGACTACAACTGATG 

CCTTCGCGAAATTGAATGATGGTGATGTTGTTACTATCAATAATGGTAAGGATACTGCCT 

ATAAATATAATGCTGCTACAGGTGGGTTTACGACGGATGTCTCCATCTCCGGGGATCCTA 

CCGCTGCTGACGCTACTGCTAATAAAACTGCCCGTGATGCACTTGCGGCGTCTTTACATG 

CTGAGCCGGGTAAAACTGTTAATGGTTCTTGGACTACGAATGATGGTACGGTAAAATTTG 

ATACCGATGCCGATGGTAAGATTTCTATTGGTGGTGTTGCTGCTTATGTAGATGCAGCAG 

GCAACCTGACCACTAACGCAGCAGGTATGACGACTCAAGCAACAACTACCGATTTGGTTA 

CTGCTGCTGCATCTGCTACTGGTAAGGGTGGATCCCTGACCTTTGGTGACACGACGTATA 

AAATTGGTCAGGGTACGGCTGGGGTTGATCCTGATGACGCTTCAGATGATGTACTGGGCA 

CCATTTCTTACTCTAAATCAGTAAGCAAGGATGTTGTTCTTGCTGATACTAAAGCAACTG 

GTAACACGACAACAGTTGATTTCAACTCCGGTATCATGACTTCAAAGGTTAGTTTCGATG 

CAGGTACATCAACTGATACATTCAAAGATGCAGATGGTGCTATCACCAAAACTAAAGAAT 

ACACCACTTCTTATGCTGTAAATAAAGATACTGGTGAAGTTACCGTTGCTGATTATGCTG 

CGGTAGATAGCGCCGATAAGGCTGTTGATGATACTAAATATAAACCGACTATCGGCGCGA 

CAGTTAACCTGAATTCTGCAGGTAAATTGACCACTGATACCACCAGTGCAGGCACAGCAA 

CCAAAGATCCTCTGGCTGCCCTGGACGCTGCTATCAGCTCCATCGACAAATTCCGTTCAT 

CCCTGGGTGCTATCCAGAACCGTCTGGATTCCGCAGTCACCAACCTGAACAACACCACTA 

CCAACCTGTCCGAAGCGCAGTCCCGTATTCAGGACGCCGACTATGCGACCGAAGTGTCCA 

ACATGTCGAAAGCGCAGATTATCCAGCAGGCCGGTAACTCCGTGCTGGCAAAAGCCAACC 

AGGTACCGCAGCAGGTTCTGTCTCTGCTACAGGGTTAA 
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AACAAAAACCAGTCTGCGCTGTCGACTTCTATC 

GAGCGCCTTTCTTCTGGTCTGCGTATTAACAGCGCTAAAGATGACGCTGCGGGCCAGGCG 
ATTGCTAACCGCTTCACTTCTAACATCAAAGGTCTGACTCAGGCCGCACGTAACGCCAAC 
GACGGTATTTCTCTGGCGCAGACCACTGAAGGCGCGCTGTCTGAGATTAACAACAACTTG
CAGCGTGTGCGTGAGTTGACTGTACAGGCGACGACCGGGACTAACTCTGATTCTGACCTG 
TCTTCTATCCAGGATGAAATCAAATCCCGTTTAAGCGAAATTGACCGTGTATCTGGTCAG 
ACTCAGTTTAACGGCGTGAACGTACTGGCTAAGAATGACACCCTGTCTATTCAGGTAGGT 
GCAAATGACGGTCAGACTATCAATATTGACCTGCAGCAAATCGATTCTCATACACTGGGT 
CTGGATGGTTTCAGCGTTAAAAATAATGATGCAGTGAAAACCAGTGCTGCCGTGAATACT 
CTTGGGGGGGGGGCAGGTTCTGTTGCTGTCGACTTCGCAACAACCAGTTTGACTGCTATC 
ACTGGTCTCGGTAGCGGTGCTATCAGCGAAATTGCTAAAGACGATAATGGTGATTACTAC 
GCGCATGTCACAGGGACTACGGGTAATACTGCTGATGGTTACTATGCTGTCGATATCGAC
AAGGCTACCGGTGAGGTCGCTCTGAAAGATGGTAACGTAGATACACCGACAGGTACGCCA 
ACGACGACAAGCACATATGACTTCACAGACGCTGGTCAAACCGTTTCCTTTGGCACTGAT 
GCTGCAACAGCCGGTATCAGCACTGGTGCTTCTCTCGTTAAACTTCAGGATGAGAAAGGC 
AATGATACTGCTACTTATGCAATCAAAGCACAAGATGGCAGCCTGTATGCCGCCAACGTT 
GATGAGGCTACCGGTAAAGTCACTGTCAAAACCGCCAGCTATACTGATGCTGACGGCAAA
GCAGTGACCGATGCCGCTGTAAAACTGGGTGGTGACAATGGCACAACCGAAATTGTTGTC 
GATGCTGCGTCAGGTAAAACTTACGATGCTGGTGCACTGCAAAACGTTGATCTCTCCAGT 
GCAACCAACACGGTAACCGCAATCCCGAACGGTAAAACCACGTCTCCGCTGGCTGCCCTT 
GACGACGCAATCAGCCAGATCGACAAATTCCGCTCCTCCCTCGGTGCGGTGCAGAACCGT 
CTGGATTCCGCGGTCACCAACCTGAACAACACCACTACCAACCTGTCTGAAGCGCAGTCC 
CGTATTCAGGACGCTGACTATGCGACCGAAGTATCCAACATGTCGAAAGCGCAGATCATC 
CAGCAGGCAGGTAACTCCGTGCTGTCCAAA 
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GCGCTGTCGACTTCTATCGAGCGCCTCTCTTCTGGTCTGCGCATTAACAGCGCTAAAG 

ATGACGCTGCGGGCCAAGCGATTGCTAACCGCTTCACTTCTAACATCAAAGGTCTGACTC 

AGGCCGCACGTAACGCCAACGACGGTATTTCTCTGGCGCAGACCACTGAAGGCGCACTGT 

CTGAAATCAACAACAACTTGCAGCGTGTTCGTGAACTGACCGTTCAGGCCACTACCGGTA 

CTAACTCTGATTCTGACCTGTCTTCAATACAGGACGAAATCAAATCCCGTCTCGATGAAA 

TTGACCGCGTATCCGGTCAGACTCAGTTCAACGGCGTTAATGTTCTTTCCAAAGATGGTT 

CAATGAAAATTCAGGTTGGTGCGAATGATGGTCAAACTATCTCCATCGATCTGAAGAAAA 

TTGATTCTTCAACTTTGGGGCTGAATGGCTTCTCAGTTTCTAAAAACTCTCTTAATGTCA 

GCAATGCTATCACATCTATCCCGCAAGCCGCTAGCAATGAACCTGTTGATGTTAACTTCG 

GTGATACTGATGAGTCTGCAGCAATCGCAGCCAAATTGGGGGTTTCCGATACGTCAAGCC 

TGTCGCTGCACAACATCCTTGATAAAGATGGTAAGGCAACAGCTGATTATGTTGTTCAGT 

CAGGTAAAGACTTCTATGCTGCTTCTGTTAATGCCGCTTCAGGTAAAGTAACCTTAAACA 

CCATTGATGTTACTTATGATGATTATGCGAACGGTGTTGACGATGCCAAGCAAACAGGTC

AGCTGATCAAAGTTTCAGCAGATAAAGACGGCGCAGCTCAAGGTTTTGTCACACTTCAAG 

GCAAAAACTATTCTGCTGGTGATGCGGCAGACATTCTTAAGAATGGAGCAACAGCTCTTA 

AGTTAACTGATCTGAATTTAAGTGATGTTACTGATACTAATGGTAAGGTAACCACAACTG 

CGACTGAGCAATTTGAAGGTGCTTCAACTGAGGATCCGCTGGCGCTTCTGGATAAAGCTA 

TTGCATCAGTCGACAAATTCCGGTCTTCTCTAGGTGCCGTGCAGAACCGTCTCGATTCCG 

CTATCACCAACCTGAACAACACCACCACCAACCTGTCTGAAGCGCAGTCCCGTATTCAGG 

ACGCCGACTATGCGACCGAAGTGTCCAACATGTCGAAAGCGCAGATCATCCAGCAGGCA 
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ATGGCACAAGTCATTAATACCAACAGCCTCTCG 

CTGATCACTCAAAATAATATCAACAAGAACCAGTCTGCGCTGTCGAGTTCTATCGAGCGT 
CTGTCTTCTGGCTTGCGTATTAACAGCGCGAAGGATGACGCCGCAGGTCAGGCGATTGCT 
AACCGTTTTACTTCTAACATTAAAGGCCTGACTCAGGCTGCACGTAACGCCAACGACGGT 
ATTTCTGTTGCACAGACCACTGAAGGCGCGCTGTCCGAAATCAACAACAACTTACAGCGT 
ATTCGTGAACTGACGGTTCAGGCCACTACAGGGACTAACTCCGATTCTGACCTGGACTCC 
ATCCAGGACGAAATCAAATCTCGTCTGGACGAAATTGACCGCGTATCTGGTCAGACCCAG 
TTCAACGGCGTGAACGTGCTGTCTAAAGATGGCTCGATGAAAATTCAGGTCGGCGCGAAC 
GATGGCGAAACGATTACTATTGATCTGAAGAAAATTGACTCTGATACGCTAAATCTGGCT 
GGTTTTAACGTGAATGGTGCTGGCTCTGTTGATAATGCCAAGGCGACTGGCAAAGATCTT 
ACTGATGCTGGTTTTACGGCAAGCGCAGCTGATGCTAATGGCAAAATCACTTATACCAAA 
GACACCGTTACTAAATTCGACAAAGCGACAGCGGCTGATGTATTGGGCAAAGCGGCTGCT 
GGCGATAGCATTACCTATGCGGGCACTGATACTGGCTTAGGAGTCGCTGCTGATGCCTCG 
ACTTACACCTACAATGCAGCCAATAAGTCTTACACTTTTGATGCTACTGGTGTTGCCAAG 
GCGGATGCTGGAACGGCACTGAAAGGGTACTTAGGCGCATCTAACACCGGTAAAATTAAT 
ATCGGTGGTACCGAGCAAGAAGTTAACATTGCCAAAGATGGCTCCATCACCGATACCAAT 
GGCGATGCGCTGTATCTCGATAGTACCGGCAACTTAACCAAAAATACCGCGAATTTGGGG 
GCTGCTGATAAAGCAACTGTAGATAAACTGTTTGCTGGTGCTCAGGATGCAACGATCACC 
TTCGATAGCGGCATGACAGCTAAATTCGATCAAACTGCTGGTACCGTTGATTTCAAAGGC 
GCGTCTATTTCTGCTGATGCAATGGCATCAACCTTAAATAATGGTTCCTATACAGCCAAC 
GTAGGTGGTAAGGCTTATGCCGTAACCGCTGGCGCAGTTCAGACAGGTGGCGCAGATGTG 
TATAAAGATACCACTGGCGCACTGACGACTGAAGATGACGAAACCGTTACCGCGACCTAC 
TACGGTTTTGCTGATGGTAAAGTTTCTGACGGTGAAGGTTCTACTGTCTATAAAGCTGCT 
GATGGTTCCATCACTAAAGATGCGACTACCAAGTCTGAAGCAACCACTGACCCTCTGAAA 
GCCCTTGACGACGCAATCAGCCAGATCGACAAATTCCGCTCCTCCCTCGGTGCCGTTCAA 

aaccgtctggattccgccgtcaccaacctgaacaacaccactaccaacctgtctgaagcg 
cagtcccgtattcaggacgccgactatgcgaccgaagtgtccaacatgtcgaaagcgcag 
atcattcagcaggccggtaactccgtgctggcaaaagccaaccaggtaccgcagcaggtt 

CTGTCTCTGCTGCAGGGTTAA 
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AACAAATCTCAGTCTTCTCTTAGCTCTGCTATTGA 

GCGTCTCTCTTCTGGCCTGCGTATTAACAGTGCTAAAGATGACGCAGCAGGTCAGGCGAT 
TGCTAACCGTTTTACGGCAAATATTAAAGGTCTGACTCAGGCTTCCCGTAACGCGAATGA 
TGGTATTTCTGTTGCGCAGACTACTGAAGGTGCGCTGAATGAAATTAACAACAACCTGCA 
GCGTGTACGTGAACTGACTGTTCAGGCAACTAACGGTACTAACTCTGACAGCGATCTTTC 
TTCTATTCAGGCAGAAATTACTCAACGTCTGGAAGAAATTGACCGTGTATCTGAGCAAAC 
TCAGTTTAACGGCGTGAAAGTCCTTGCCGAAAATAATGAAATGAAAATTCAGGTTGGTGC 
TAATGATGGGGAAACCATCACTATCAATCTGGCAAAAATTGATGCGAAAACTCTCGGCCT 
GGACGGCTTTAATATCGATGGCGCGCAGAAAGCAACTGGCAGTGACCTGATTTCTAAATT 
TAAAGCGACAGGTACTGATAATTATCAAATTAACGGTACTGATAACTATACTGTTAATGT 
AGATAGTGGAGCAGTTCAAAATGAGGATGGTGACGCAATTTTTGTTAGCGCTACCGATGG 
TTCTCTGACTACTAAGAGTGATACAAAAGTCGGTGGTACAGGTATTGATGCGACTGGGCT 
TGCAAAAGCCGCAGTTTCTTTAGCTAAAGATGCCTCAATTAAATACCAAGGTATTACTTT 
CACCAACAAAGGCACTGATGCATTTGATGGCAGTGGTAACGGCACTCTAACCGCTAATAT 
TGATGGCAAAGATGTAACCTTTACTATTGATGCGACAGGGAAGGACGCAACATTAAAAAC 
GTCTGATCCTGTTTACAAAAATAGTGCAGGTCAGTTCACTACAACTAAGGTTGAAAACAA 
AGCCGCTACAGCATCGGATCTGGACTTAAATAACGCTAAAAAAGTGGGTAGTTCTTTAGT 
TGTAAATGGCGCTGATTATGAAGTTAGCGCTGATGGTAAGACAGTAACTGGGCTTGGCAA 
AACTATGTATCTGAGCAAATCAGAAGGTGGTAGCCCGATTCTGGTAAAAGAAGATGCAGC 
AAAATCGTTGCAATCTACTACCAACCCGCTCGAAACCATCGACAAGGCATTGGCTAAAGT 
TGACAATCTGCGTTCTGACCTCGGTGCAGTACAAAACCGTTTCGACTCTGCTATCACCAA 
CCTTGGCAACACCGTAAACAACCTGTCTTCTGCCCGTAGCCGTATCGAAGATGCTGACTA 
CGCGACCGAAGTGTCTAACATGTCTCGTGCGCAGATCCTGCAACAAGCGGGTACCTCTGT
TCTGGCGCAG
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ATGGCACAAGTCATTAATACCAACAGCCTCTCGCTGATCACTCAAAATA 

ATATCAACAAGAACCAGTCTGCGCTGTCGAGTTCTATCGAGCGTCTGTCTTCTGGCTTGC 

GTATTAACAGCGCGAAGGATGACGCCGCAGGTCAGGCGATTGCTAACCGTTTTACTTCTA 

ACATTAAAGGCCTGACTCAGGCTGCACGTAACGCCAACGATGGTATTTCTGTTGCACAGA 

CCACTGAAGGCGCGCTGTCCGAAATCAACAACAACTTACAGCGTATCCGTGAACTGACGG 

TTCAGGCTTCTACCGGGACTAACTCCGATTCGGATCTGGACTCCATTCAGGACGAAATCA 

AATCCCGTCTGGACGAAATTGACCGCGTATCTGGCCAGACCCAGTTCAACGGCGTGAACG 

TACTGGCGAAAGACGGTTCAATGAAAATTCAGGTTGGTGCGAATGACGGCCAGACTATCA 

CGATTGATCTGAAGAAAATTGACTCTGATACGCTGGGGCTGAGTGGGTTTAATGTGAATG 

GTAGCGGGGCTGTGGCTAATACTGCAGCGACTAAATCTGATTTGGCAGCAGCTCAACTCT 

TGGCTCCAGGTACTGCTGATGCTAATGGTACAGTTACCTATACTGTTGGCGCAGGCCTGA 

AAACATCTACAGCTGCAGATGTAATTGCGAGTTTGGCTAATAACGCAAAAGTTAATGCCA 

CAATTGCAAATGGTTTTGGATCGCCAACAGCTACAGATTATACATACAACAGCGCTACAG 

GCGATTTTACATATAGTGCAACTATTGCAGCTGGTACAAATTCTGGTGATAGTAACAGTG 

CTCAGTTACAATCCTTCCTGACACCAAAAGCGGGCGATACTGCTAACTTAAACGTTAAAA 

TTGGTTCTACGTCAATTGACGTTGTATTGGCTAGCGACGGTAAAATTACCGCGAAAGATG 

GTTCAGAACTATTTATTGACGTAGATGGTAACCTCACTCAAAACAATGCTGGGACTGTCA 

AAGCAGCCACTCTTGATGCACTGACTAAAAACTGGCATACAACAGGCACACCGAGTGCCG

TATCTACGGTAATTACAACTGAAGATGAAACAACCTTCACTCTGGCTGGCGGTACTGATG 

CTACTACTTCTGGTGCAATCACTGTAGCAAATGCAAGAATGAGTGCTGAGTCTCTTCAAT 

CGGCAACTAAGTCCACAGGATTCACAGTTGATGTTGGAGCTACTGGTACCAGCGCAGGCG 

ATATTAAAGTTGATAGTAAAGGTATAGTACAACAACACACAGGTACAGGTTTTGAAGACG 

CTTACACCAAAGCTGATGGTTCACTGACTACCGATAATACAACCAATCTGTTTTTGCAAA 

AAGACGGAACTGTGACCAATGGTTCAGGTAAAGCAGTCTATGTTTCAGCGGATGGTAATT 

TTACTACTGACGCTGAAACTAAAGCTGCAACCACCGCCGATCCACTGAAAGCTCTGGACG 

AAGCGATCAGCTCCATCGACAAATTCCGTTCTTCCCTCGGTGCGGTGCAAAACCGTCTGG 

ATTCCGCAGTCACCAACCTGAACAACACCACTACTAACCTGTCTGAAGCGCAGTCCCGTA 

TTCAGGACGCTGACTATGCGACCGAAGTGTCCAATATGTCGAAAGCGCAGATCATCCAGC 

AGGCCGGTAACTCCGTGCTGGCAAAAGCTAACCAGGTACCGCAGCAGGTTCTGTCTCTGC

TGCAGGGTTAA
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AACAAAAACCAGTCTGCGCTGTCGACTTCTATCGAGCGCCTCTCTT 

CTGGTCTGCGCATTAACAGCGCTAAAGATGACGCTGCGGGCCAGGCGATTGCTAACCGCT 
TCACTTCTAACATCAAAGGTCTGACTCAGGCCGCACGTAACGCCAACGACGGTATCTCTC 
TGGCGCAGACCACTGAAGGCGCACTGTCTGAAATCAACAACAACTTGCAGCGTGTTCGTG 
AGCTGACCGTTCAGGCCACTACCGGTACTAACTCTGATTCTGACCTGTCTTCAATCCAGG 
ACGAAATCAAATCCCGTCTCGATGAAATTGACCGCGTATCCGGTCAGACTCAGTTCAACG 
GCGTGAACGTACTGGCAAAAGATAACACCATGAAGATTCAGGTTGGTGCGAACGATGGTC 
AGACTATATCCATCGACCTGCAAAAAATCGACTCTTCTACTCTTGGTTTGAACGGTTTCT 
CCGTTTCTAAAAATGCTCTCGAAACTAGCGAAGCGATCACTCAGTTGCCGAACGGTGCGA 
ATGCACCAATCGCTGTGAAGATGGATGCGTCTGTTCTGACCGATCTTAACATTACTGATG 
CTTCCGCTGTTTCGCTGCACAACGTAACTAAAGGTGGTGTCGCAACGTCTACTTATGTTG 
TTCAGTATGGCGATAAGAGCTATGCAGCATCTGTTGATGCGGGAGGTACAGTAAAACTGA 
ATAAAGCCGACGTAACATATAACGACGCAGCAAATGGTGTTACGAATGCCACCCAGATTG 
GTAGTCTGGTTCAGGTTGGTGCTGATGCAAACAATGATGCAGTTGGTTTTGTTACCGTGC 
AGGGGAAAAACTATGTTGCTAATGACTCATTAGTCAATGCTAATGGCGCTGCTGGCGCTG 
CAGCAACTAGAGTTACAATTGATGGTGATGGTAGCCTTGGAGCTAACCAGGCTAAAATTG 
AACTTAGCCAAAATGGTGCTACTGCTGCAACATCAGAGTTCGCTGGTGCTTCAACCAACG 
ATCCACTGACTCTGCTGGACAAAGCTATCGCATCTGTTGATAAATTCCGTTCTTCTTTGG 
GGGCGGTACAGAACCGTCTGAGCTCCGCTGTAACCAACCTGAACAACACCACTACCAACC 
TGTCTGAAGCGCAGTCCCGTATTCAGGACGCCGACTATGCGACCGAAGTGTCCAACATGT 
CGAAAGCGCAGATCATCCAGCAGGCAGGTAACTCCGTGCTGTCCAAA 
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atggcacaagtcattaataccaacagcctctcgctgatcactcaaaataatatcaacaaga 

accagtctgcgctgtcgagttctatcgagcgtctgtcttctggcttgcgtattaacagcg 

cgaaggatgacgccgcaggtcaggcgattgctaaccgttttacttctaacattaaaggcc 

tgactcaggctgcacgtaacgccaacgacggtatttctgttgcacagaccactgaaggcg 

cgctgtccgaaatcaacaacaacttacagcgtattcgtgaactgacggttcaggcgacga 

ccggaactaactccacctctgacctggactccattcaggacgaaatcaaatcccgtcttg 

atgaaattgaccgcgtatccggccaaacccagttcaacggcgtgaacgtactgtcaaaag 

atggctcgatgaaaattcaggtcggcgcaaatgatggtgaaaccatcacgattgatctga 

aaaagatcgactcttctacattgaagctgaccagcttcaatgttaacggtaaaggcgctg

ttgataatgctaaagccactgaagcagatctgaccgctgcgggcttctcccaaggtgcag 

tcgtcagtggcaacagcacctggactaaatctactgttactacctttaatgcagcaacag 

ctaccgacgtgctggcaagcgttagcggcggcagcactattagcggttataccggtacaa 

acaatggattaggcgtagcggcttctactgcatatacctacaacgcaaccagcaagtctt 

attcatttgacgcaaccgcacttaccaatggcgatggtactggggccaccactaaagttg 

ctgatgtgctgaaagcctatgcagcaaacggtgataatacggctcagatctccatcggcg 

gaagcgctcaggacgttaaaattgccagcgatggcaccctgactgacgtcaatggtgatg 

ctttatatattggttctgacggcaacctgactaaaaaccaggccggcggtccagatgcgg 

CAACGTTGGACGGTATTTTCAACGGTGCGAATGGTAATGCAGCAGTTGATGCGAAGATTA 

cattcggcagcggcatgaccgttgatttcacccaggctagcaaaaaagtggatattaagg 
gcgcaacggtatccgccgaagatatggacactgcgttaactgggcaggcttataccgtag 
ctaacggcgcacagtcttttgacgttgccgctggtggggcagtaaccgctactacaggtg 

GCGCTACCGTAAATATTGGTGCTGATGGTGAACTGACGACTGCGACCAACAAGACTGTCA 

cagaaacttatcacgaatttgctaacggcaatattctggatgatgacggcgcggctctgt 
acaaagcggctgacggttctctgaccactgaagctactggtaaatccgaagtgaccacgg 
atccgctgaaagcgctggacgatgctatcgcatccgtagacaaattccgctcctccctcg 
gtgcggtgcagaaccgtctggattccgcagtcaccaacctgaacaacaccactaccaacc
tgtctgaagcgcagtcccgcattcaggacgccgactatgcgaccgaagtgtccaatatgt 
cgaaagcgcagatcatccagcaggccggtaactccgtgctggcaaaagccaaccaggtac 
cgcagcaggttctgtctctgctgcagggttaa 
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ATGGCACAAGTCATTAATACCAACAGCCTCTCGCTGATCAC 

TCAAAATAATATCAACAAGAACCAGTCTGCGCTGTCGAGTTCTATCGAGCGTCTGTCTTC 

TGGCTTGCGTATTAACAGCGCTAAGGATGACGCCGCGGGTCAGGCGATTGCTAACCGTTT 

TACTTCTAACATTAAAGGCCTGACTCAGGCTGCACGTAACGCCAACGACGGTATTTCTGT

TGCGCAGACCACTGAAGGCGCGCTGTCCGAAATCAACAACAACTTACAGCGTATCCGTGA 

ACTGACGGTTCAGGCTTCTACCGGGACTAACTCCGATTCGGATCTGGACTCCATTCAGGA 

CGAAATCAAATCCCGTCTGGACGAAATTGACCGCGTATCTGGCCAGACCCAGTTCAACGG 

CGTGAACGTACTGGCGAAAGACGGTTCAATGAAAATTCAGGTTGGTGCGAATGACGGCCA 

GACTATCACTATTGATCTGAAGAAAATTGACTCAGATACGCTGGGGCTGAGTGGGTTTAA 

TGTGAATGGTGGCGGGGCTGTTGCTAATACTGCAGCGACTAAAGATGATTTGGTCGCTGC 

ATCAGTTTCAGCTGCGGTAGGTAATGAATACACTGTCTCTGCTGGCCTGTCGAAATCAAC 

TGCTGCTGATGTTATTGCTAGTCTCACAGATGGTGCGACAGTAACTGCGGCTGGTGTAAG 

CAATGGTTTTGCTGCAGGGGCAACTGGAGATGCTTATAAATTCAATCAAGCAAACAACAC 

TTTTACTTACAATACCACCTCAACAGCGGCAGAACTCCAATCTTACCTCACGCCTAAGGC 

GGGGGATACCGCAACTTTCTCCGTTGAAATTGGTGGCACCAAGCAGGATGTTGTTCTGGC 

TAGTGATGGCAAAATCACAGCAAAAGACGGGTCTAAACTTTATATTGACACCACAGGGAA 

TTTAACCCAAAACGGTGGAGGTACTTTAGAAGAAGCTACCCTCAATGGCTTAGCTTTCAA 

CCACTCTGGTCCAGCCGCTGCTGTACAATCTACTATTACTACTGCGGATGGAACTTCAAT 

AGTTCTAGCAGGTTCTGGCGACTTTGGAACAACAAAAACTGCTGGGGCTATTAATGTCAC 

AGGAGCAGTGATCAGTGCTGATGCACTTCTTTCCGCCAGTAAAGCGACTGGGTTTACTTC 

TGGCACTTATACCGTAGGTACAGATGGAGTTGTTAAATCTGGTGGCAATGACGTTTATAA 

CAAAGCTGACGGGACGGGATTAACTACTGACAATACCACAAAATATTATTTACAAGATGA 

CGGGTCTGTAACTAATGGTTCTGGTAAAGCTGTGTATGCTGATGCAACAGGAAAACTAAC 

TACTGACGCTGAAACTAAAGCCGAAACCACCGCCGATCCCCTGAAAGCTCTGGACGAAGC 

GATCAGCTCCATCGACAAATTCCGTTCTTCCCTCGGTGCGGTGCAAAACCGTCTGGATTC 

CGCGGTCACCAACCTGAACAACACCACTACCAACCTGTCCGAAGCGCAGTCCCGTATTCA 

GGACGCCGACTATGCGACCGAAGTGTCCAACATGTCGAAAGCGCAGATCATCCAGCAGGC 

CGGTAACTCCGTGCTGGCAAAAGCTAACCAGGTACCGCAGCAGGTTCTGTCTCTGCTGCA

GGGTTAA
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ATGGCACAAGTCATTAATACCAACAGCCTCTCGCTGATCAC 

TCAAAATAATATCAACAAGAACCAGTCTGCGCTGTCGAGTTCTATCGAGCGTCTGTCTTC 
TGGCTTGCGTATTAACAGCGCGAAGGATGACGCCGCGGGTCAGGCGATTGCTAACCGTTT
TACTTCTAACATTAAAGGCCTGACTCAGGCTGCACGTAACGCCAACGACGGTATTTCCGT 
TGCGCAGACCACCGAAGGCGCGCTGTCCGAAATCAACAACAACTTACAGCGTATCCGTGA 
ACTGACGGTTCAGGCCACTACCGGTACTAACTCCGATTCTGACCTGGACTCCATCCAGGA 
CGAAATCAAATCTCGTCTTGATGAAATTGACCGCGTATCTGGTCAGACCCAGTTCAATGG 
CGTGAATGTGTTGTCCAAAGACGGTTCAATGAAAATTCAGGTGGGCGCAAATGATGGTGA 
AACCATCACGATTGACCTGAAAAAAATCGACTCTTCTACACTGAAGCTGACCAGCTTCAA 
CGTCAACGGTAAAGGCGCTGTTGATAATGCAAAAGCCACTGAAGCAGATCTGACCGCTGC 
GGGCTTCTCCCAAAGTGCAGTTGTCAGTGGCAATAGCACCTGGACTAAATCTACTGTTAC 
TACCTTTAATGCAGCAACAGCTACCGATGTGCTGGCTAGCGTTAGTGGCGGCAGCACTAT 
TAGCGGTTATGCTGGCACAAACAATGGGTTAGGCGTAGCGGCTTCTACTGCATATACCTA 
CAACGCAACCAGCAAGTCTTATTCATTTGACGCAACCGCACTTACTAATGGTGATGGTAC 
TGCGGGCTCAACTAAAGTTGCTGATGTTCTGAAAGCCTATGCAGCAAACGGCGATAACAC 
GGCTCAGATCTCCATCGGTGGTAGCGCTCAGGAAGTTAAAATTGCCAGCGATGGTACCCT 
GACGGATACTAATGGCGATGCTTTATACATTGGTGCTGACGGTAACCTGACGAAAAACCA
GGCCGGCGGCCCAGCCGCGGCAACGTTGGACGGTATTTTCAACGGTGCGAATGGTCATGA 
TGCAGTTGATGCGAAGATTACCTTCGGCAGCGGCATGACCGTTGACTTCACCCAGGTTAG 
CAACAATGTGGATATTAAGGGCGCGACGGTATCCGCCGAAGATATGAACACTGCGTTAAC 
CGGTCAGGCTTATACCGTAGCTAACGGCGCACAGTCTTATGACGTTGCCGCTGATGGTGC 
AGTAACTGCTACTACAGGTGGAGCGACCGTAAATATTGGTGCTGAGGGTGAACTGACGAC 
TGCGGCCAACAAGACTGTCACAGAAACTTATCACGAATTTGCTAACGGCAATATTCTGGA 
TGATGACGGCGCGGCTCTGTATAAAGCGGCTGACGGCTCTCTGACCACTGAAGCTACAGG 
TAAATCTGAAGCGACCACGGATCCGCTGAAAGCGCTGGACGATGCTATCGCATCCGTAGA 
CAAATTCCGTTCTTCCCTGGGTGCCGTGCAGAACCGTCTGGATTCCGCAGTCACCAACCT 
GAACAACACCACTACCAACCTGTCCGAAGCGCAGTCCCGTATTCAGGACGCCGACTATGC 
GACCGAAGTGTCCAACATGTCGAAAGCGCAGATTATTCAGCAGGCAGGTAACTCCGTGCT 
GGCAAAAGCTAACCAGGTACCGCAGCAGGTTCTGTCTCTGCTGCAGGGTTAA 
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AACAAAAACCAGTCTGCGCTGTCGACTTCTAT 

CGAGCGCCTCTCTTCTGGTCTGCGCATTAACAGCGCTAAAGATGACGCTGCGGGCCAGGC 

GATTGCTAACCGCTTCACTTCTAACATCAAAGGTCTGACTCAGGCCGCACGTAACGCCAA 

CGACGGTATCTCTCTGGCGCAGACCACTGAAGGCGCACTGTCTGAAATCAACAACAACTT 

GCAGCGTGTGCGTGAGTTGACTGTTCAGGCGACGACCGGGACTAACTCTGATTCTGACCT 

GTCTTCTATTCAGGACGAAATCAAATCCCGTCTGGATGAAATTGACCGTGTTTCCGGTCA 

GACCCAGTTCAACGGCGTGAACGTGCTGGCTAAAAACGGTTCTATGGCGATTCAGGTTGG 

CGCGAATGATGGGCAGACCATCAACATCGACCTGCAGAAAATCGACTCTTCTACTCTGGG 

CCTGGGCGGCTTCTCCGTATCTAACAATGCACTGAAACTGAGCGATTCTATCACTCAGGT 

TGGTGCGAGTGGTTCACTGGCAGATGTGAAACTGAGCTCTGTTGCCTCGGCTCTGGGTGT 

AGACGCAAGCACTCTGACTCTGCACAACGTACAGACCCCAGCTGGCGCAGCAACAGCTAA 

CTATGTTGTCTCTTCTGGTTCTGACAACTACTCAGTATCTGTTGAAGATAGCTCCGGTAC 

AGTTACGCTGAACACCACTGATATAGGTTATACCGATACCGCTAATGGCGTTACTACCGG 

TTCCATGACTGGTAAGTACGTTAAAGTTGGAGCTGATGCATTGGGTGCTGCTGTAGGTTA 

TGTCACCGTACAGGGACAAAACTTCAAAGCTGATGCTGGCGCGCTGGTTAACTCCAAGAA 

TGCTGCTGGTAGTCAGAATGTTACTTCTGCAATTGGCGATATTGCTAATAAAGCGAATGC 

TAACATTTACACTGGAACCTCTTCTGCAGATCCACTGGCTCTGCTGGACAAAGCTATCGC 

ATCTGTTGATAAATTCCGTTCTTCTCTAGGGGCGGTGCAGAACCGTCTGAGCTCTGCTGT 

AACCAACCTGAACAACACCACTACCAACCTGTCCGAAGCGCAGTCCCGTATTCAGGACGC 

CGACTATGCGACCGAAGTGTCCAACATGTCGAAAGCGCAGATCATCCAGCAGGCGGGTAA 

CTCCGTGCTGTCTAAA 
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ATGGCACAAGTCATTAATACCAACAGCCTCTCGCTGATCA 

CTCAAAATAATATCAACAAGAACCAGTCTGCGCTGTCGAGTTCTATCGAGCGTCTGTCTT 

CTGGCTTGCGTATTAACAGCGCGAAGGATGACGCCGCCGGTCAGGCGATTGCTAACCGTT 

TTACTTCTAACATTAAAGGCCTGACTCAGGCTGCACGTAACGCCAATGACGGTATTTCTG 

TTGCACAGACCACTGAAGGCGCGCTGTCCGAAATCAACAACAACTTACAGCGTATTCGTG 

AACTGACGGTTCAGGCTTCTACCGGGACTAACTCTGATTCGGATCTGGACTCCATTCAGG 

ACGAAATCAAATCCCGTCTCGACGAAATTGACCGCGTATCCGGTCAGACCCAGTTCAACG 

GCGTGAACGTACTGGCAAAAGACGGTTCGATGAAAATTCAGGTTGGTGCGAACGACGGCC 

AGACTATCACTATTGATCTGAAGAAAATTGACTCTGATACGCTGGGGCTGAGTGGGTTTA 

ACGTAAATGGTAGCGCAGATAAGGCAAGTGTCGCGGCGACAGCTGACGGAATGGTTAAAG 

ACGGATATATCAAAGGGTTAACTTCATCTGACGGCAGCACTGCATATACTAAAACTACAG 

CAAATACTGCAGCAAAAGGATCTGATATTCTTGCGGCGCTTAAGACTGGCGATAAAATTA 

CCGCAACAGGTGCAAATAGCCTTGCTGATAATGCGACATCGACAACTTATACTTATAATG 

CAACCAGCAATACCTTCTCCTATACGGCTGACGGTGTAAACCAAACGAATGCTGCAGCAA

ATCTCATACCTGCAGCAGGGAAAACGACAGCTGCATCAGTTACTATTGGTGGGACAGCAC 

AGAATGTAAATATTGATGATTCGGGCAATATTACTTCAAGTGATGGCGATCAACTTTATC 

TGGATTCAACAGGTAACCTGACTAAAAACCAGGCCGGCAACCCGAAAAAAGCAACCGTTT 

CTGGGCTTCTCGGAAATACGGATGCGAAAGGTACTGCTGTTAAAACAACCATCAAGACAG 

AGGCTGGTGTAACAGTTACAGCTGAAGGTAATACAGGTACTGTAAAAATTGAAGGTGCTA 

CTGTTTCAGCATCTGCATTTACGGGCATTGCATATTCCGCCAACACCGGTGGGAATACTT 

ATGCTGTTGCCGCAAATAATACTACAAATGGTTTCCTGGCGGGGGATGACTTAACCCAGG 

ATGCTCAAACTGTTTCAACCTACTACTCGCAAGCCGATGGCACGGTCACGAATAGCGCAG 

GCAAAGAAATCTATAAAGACGCTGATGGTGTCTACAGCACAGAGAATAAAACATCGAAGA 

CGTCCGATCCATTGGCTGCGCTTGACGACGCAATCAGCTCCATCGACAAATTCCGTTCAT 

CCTTGGGTGCTATCCAGAACCGTCTGGATTCCGCGGTCACCAACCTGAACAACACCACTA 

CCAACCTGTCCGAAGCGCAGTCCCGTATTCAGGACGCCGACTATGCGACCGAAGTGTCCA 

ACATGTCGAAAGCGCAGATCATCCAGCAGGCCGGTAACTCCGTGCTGGCAAAAGCTAACC 

AGGTACCGCAGCAGGTTCTGTCTCTGCTGCAGGGCTAA 
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AACAAATCTCAGTCTTCTCTGAGCTCCGCCATTGAACGTCTCTCTTCTGGCCTGCGTA 

TTAACAGTGCTAAAGATGACGCAGCAGGTCAGGCGATTGCTAACCGTTTTACAGCAAATA 

TTAAAGGTCTGACTCAGGCTTCCCGTAACGCGAATGATGGTATTTCTGTTGCGCAGACCA 

CTGAAGGTGCGCTGAATGAAATTAACAACAACCTGCAGCGTGTACGTGAACTGACTGTTC 

AGGCAACTAACGGTACTAACTCTGACAGCGATCTTTCTTCTATCCAGGCTGAAATTACTC 

AACGTCTGGAAGAAATTGACCGTGTATCTGAGCAAACTCAGTTTAACGGCGTGAAAGTCC 

TTGCTGAAAATAATGAAATGAAAATTCAGGTTGGTGCTAATGATGGTGAAACCATCACTA 

TCAATCTGGCAAAAATTGATGCGAAAACTCTCGGCCTGGACGGTTTTAATATCGATGGCG 

CGCAGAAAGCAACTGGCAGTGACCTGATTTCTAAATTTAAAGCGACAGGTACTGATAACT

ATGATGTTGGCGGTGATGCTTATACTGTTAACGTAGATAGCGGAGCTGGGTAATGACTCC 

AACTTATTGATAGTGTTTTATGTTCAGATAATGCCCGATGACTTTGTCATGCAGCTCCAC 

CGATTTTGAGAACGACAGCGACTTCCGTCCCAGCCGTGCCAGGTGCTGCCTCAGATTCAG 

GTTATGCCGCTCAATTCGCTGCGTATATCGCTTGCTGATTACGTGCAGCTTTCCCTTCAG 

GCGGGATTCATACAGCGGCCAGCCATCCGTCATCCATATCACCACGTCAAAGGGTGACAG 

CAGGCTCATAAGACGCCCCAGCGTCGCCATAGTGCGTTCACCGAATACGTGCGCAACAAC 

CGTCTTCCGGAGCCTGTCATACGCGTAAAACAGCCAGCGCTGGCGCGATTTAGCCCCGAC 

ATAGTCCCACTGTTCGTCCATTTCCGCGCAGACGATGACGTCACTGCCCGGCTGTATGCG 

CGAGGTTACCGACTGCGGCCTGAGTTTTTTAAGTGACGTAAAATCGTGTTGAGGCCAACG 

CCCATAATGCGGGCAGTTGCCCGGCATCCAACGCCATTCATGGCCATATCAATGATTTTC 

TGGTGCGTACCGGGTTGAGAAGCGGTGTAAGTGAACTGCAGTTGCCATGTTTTACGGCAG 

TGAGAGCAGAGATAGCGCTGATGTCCGGCGGTGCTTTTGCCGTTACGCACCACCCCGTCA 

GTAGCTGAACAGGAGGGACAGCTGATAGAAACAGAAGCCACTGGAGCACCTCAAATACAC 

CATCATACACTAAATCAGTAAGTTGGCAGCATTACCGCGGAGCTGTTAAAGATACTACAG 

GGAATGATATTTTTGTTAGTGCAGCAGATGGTTCACTGACAACTAAATCTGACACAAACA 

TAGCTGGTACAGGGATTGATGCTACAGCACTCGCAGCAGCGGCTAAGAATAAAGCACAGA 

ATGATAAATTCACGTTTAATGGAGTTGAATTCACAACAACAACTGCAGCGGATGGCAATG 

GGAATGGTGTATATTCTGCAGAAATTGATGGTAAGTCAGTGACATTTACTGTGACAGATG 

CTGACATUVAAAGCTTCTTTGATTACGAGTGAGACAGTTTACAAAAATAGCGCTGGCCTTT 

ATACGACAACCAAAGTTGATAACAAGGCTGCCACACTTTCCGATCTTGATCTCAATGCAG 

CTAAGAAAACAGGAAGCACGTTAGTTGTTAACGGTGCAACTTACGATGTTAGTGCAGATG 

GTAAAACGATAACGGAGACTGCTTCTGGTAACAATAAAGTCATGTATCTGAGCAAATCAG 

AAGGTGGTAGCCCGATTCTGGTAAACGAAGATGCAGCAAAATCGTTGCAATCTACCACCA 

ACCCGCTCGAAACTATCGACAAAGCATTGGCTAAAGTTGACAATCTGCGTTCTGACCTCG 

GTGCAGTACAAAACCGTTTCGACTCTGCTATCACCAACCTTGGCAACACCGTAAACAACC 

TGTCTTCTGCCCGTAGCCGTATCGAAGATGCTGACTACGCGACCGAAGTGTCTAACATGT 

CTCGTGCGCAGATCCTGCAACAAGCGGGTACCTCTGTTCTGGCGCAG 
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AACAAGAACCAGTCTGCGCTGTCGAGTTCTATCGAGCGTCTGT 

CTTCTGGCTTGCGTATTAACAGCGCGAAGGATGACGCCGCAGGTCAGGCGATTGCTAACC 
GTTTTACTTCTAACATTAAAGGCCTGACTCAGGCTGCACGTAACGCCAACGACGGTATTT 
CTGTTGCGCAGACCACCGAAGGCGCGCTGTCCGAAATCAACAACAACTTACAGCGTGTGC 
GTGAACTGACCGTTCAGGCAACCACCGGTACCAACTCCCAGTCTGACCTGGACTCTATCC 
AGGACGAAATTAAATCCCGTCTGGACGAAATTGACCGCGTATCCGGTCAGACCCAGTTCA 
ACGGCGTGAACGTACTGGCAAAAGACGGTTCCATGAAAATTCAGGTTGGCGCGAACGATG 
GCCAGACCATCACTATCGACCTGAAGAAGATTGACTCTTCTACGCTGAAACTGACTGGTT 
TTAACGTGAATGGCAAAGCAGCGGTTGATAATGCTAAAGCGACGGATGCAAATCTGACTA 
CCGCCGGTTTTACACAAGGCGTTGTGGATTCAAATGGTAATAGTACTTGGACTAAATCAA 
CTACGACTAATTTCGATGCGGCAACTGCAGTAAACGTACTAGCAGCAGTTAAAGATGGCA 
GCACAATCAATTACACCGGTACTGGTAATGGTTTAGGGATTGCTGCAACAAGTGCTTATA 
CATATCACGATAGCACTAAATCCTATACCTTTGATTCTACGGGGGCTGCAGTAGCTGGTG 
CCGCGTCCAGCCTGCAAGGTACTTTTGGTACAGATACGAATACTGCAAAAATCACCATCG 
ATGGTTCTGCTCAAGAAGTAAACATCGCTAAAGATGGGAAAATTACTGATACTGATGGTA 
AAGCTTTATATATCGATTCCACTGGTAATTTGACTAAGAACGGCTCTGATACTTTAACTC 
AGGCAACATTGAATGATGTCCTTACTGGTGCTAATTCAGTTGATGATACAAGGATTGACT 
TCGATAGCGGCATGTCTGTCACCCTTGATAAAGTGAACAGCACTGTAGATATCACTGGCG 
CATCTATTTCAGCCGCTGCAATGACTAATGAGTTGACAGGTAAGGCCTATACCGTAGTAA
ATGGTGCAGAATCTTACGCTGTAGCTACTAATAACACAGTAAAAACGACTGCTGATGCTA 
AAAATGTTTATGTTGATGCTAGTGGTAAATTAACTACTGATGACAAAGCCACTGTTACAG 
AAACTTATCATGAATTTGCGAATGGCAATATCTATGATGATAAAGGCGCTGCTGTTTATG 
CGGCGGCGGATGGTTCTCTGACTACAGAAACTACAAGTAAATCAGAAGCTACAGCTAACC 
CGCTGGCCGCTCTGGACGACGCAATCAGCCAGATCGACAAATTCCGTTCATCCCTGGGTG 
CTATCCAGAACCGTCTGGATTCCGCAGTCACCAACCTGAACAACACCACTACCAATCTGT 
CTGAAGCGCAGTCCCGTATTCAGGACGCCGACTATGCGACCGAAGTGTCCAATATGTCGA 
AAGCGCAGATCATCCAGCAGGCAGGCAACTCCGTGCTGGCAAAA 



Figure 40 








AACAAAAACCAGTCTGCGCTGTCGACTTCTATCGAGCGCCTCTC 

TTCTGGTCTGCGCATTAACAGCGCTAAAGATGACGCTGCGGGCCAGGCGATTGCTAACCG 

GTTCACTTCTAACATCAAAGGTCTGACTCAGGCCGCACGTAACGCCAACGACGGTATCTC 

TCTGGCGCAGACCACTGAAGGCGCACTGTCTGAAATCAACAACAACTTGCAGCGTGTTCG 

TGAACTGACCGTTCAGGCCACTACCGGTACTAACTCTGATTCTGACCTGTCTTCAATCCA 

GGACGAAATCAAATCCCGTCTCGATGAAATTGACCGCGTATCCGGTCAGACTCAGTTCAA 

CGGCGTGAACGTACTGGCAAAAGATGGCTCGATGAAAATTCAGGTCGGTGCAAATGATGG 

TCAGACAATCAGCATTGATTTGCAGAAGATTGATTCTTCTACTTTAGGGTTAAATGGTTT 

TTCTGTTTCCAAAAATGCAGTATCTGTTGGTGATGCTATTACTCAATTGCCTGGCGAGAC 

GGCAGCCGATGCACCAGTAACCATCAAGTTTGATGATTCAGTAAAAACTGATTTAAAACT 

GACCGATGCTTCAGGGTTAAGTCTGCATAACCTCAAAGATGAAAATGGTAATTTAACTAA 

CCAGTATGTTGTACAGAATGGCGGAAAATCTTACGCTGCTACAGTCGCTGCCAATGGTAA 

TGTTACGCTGAACAAAGCAAATGTAACCTACAGCGATGTCGCAAACGGTATTGATACCGC 

AACGCAGTCAGGCCAGTTAGTTCAGGTTGGTGCAGATTCTACCGGTACGCCAAAAGCATT 

CGTGTCTGTCCAAGGTAAAAGCTTTGGCATTGATGACGCCGCCTTGAAGAATAACACTGG 

TGATGCTACCGCTACTCAACCGGGAACATCTGGGACAACAGTTGTCGCAGCGTCAATTCA 

TCTGAGTACGGGCAAAAACTCTGTAGACGCTGATGTAACGGCTTCCACTGAATTCACAGG 

TGCTTCAACCAACGATCCACTGACTCTGCTGGACAAAGCTATCGCATCTGTTGATAAATT 

CCGTTCTTCTTTGGGGGCGGTACAGAACCGTCTGAGCTCCGCTGTAACCAACCTGAACAA

CACCACCACCAACCTGTCTGAAGCGCAGTCCCGTATTCAGGACGCCGACTATGCGACCGA 

AGTGTCCAACATGTCGAAAGCGCAGATTATCCAGCAGGCAGGTAACTCCGTGCTGTCCAA

A
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AACAAAAACCAGTCTGCGCTGTCGACTTCTATCGAGCGCCTCTCTTCTGGTC 

TGCGCATTAACAGCGCTAAAGATGACGCTGCGGGCCAGGCGATTGCTAACCGCTTCACTT 

CTAACATCAAAGGTCTGACTCAGGCTGCACGTAACGCCAATGACGGTATTTCTCTAGCAC 

AGACAGCGGAAGGCGCGCTGTCAGAGATTAACAACAACTTGCAGCGTGTGCGTGAGTTGA 

CCGTGCAGGCAACCACTGGTACCAACTCTGATTCCGATCTCTCTTCTATTCAGGATGAAA 

TTAAATCTCGTCTGGATGAAATTGACCGCGTCTCTGGTCAGACCCAGTTTAACGGCGTGA 

ACGTACTGGCTAAAAACGGTTCTATGGCAATTCAGGTTGGCGCGAACGATGGCCAGACTA 

TCTCTATCGACCTGCAGAAAATAGACTCTTCTACTCTGGGTCTGAGCGGCTTCTCTGTTT 

CTCAGAACTCCCTGAAACTGAGCGATTCTATCACTACGATCGGCAATACTACTGCTGCAT 

CGAAGAACGTGGACCTGAGCGCAGTAGCAACTAAACTGGGCGTGAATGCAAGCACCCTGA 

GCCTGCACGAAGTTCAGGACTCTGCTGGTGACGGTACTGGTACCTTCGTTGTTTCTTCTG 

GCAGCGACAACTATGCTGTGTCTGTAGACGCGGCCTCTGGTGCAGTTAACCTGAACACCA 

CTGACGTCACCTATGATGACGCTACTAATGGTGTTACTGGCGCGACTCAGAACGGTCAGC 

TGATCAAAGTAACTTCTGACGCCAACGGTGCAGCTGTTGGTTACGTAACCATTCAGGGTA 

AAAACTATCAGGCTGGTGCGACCGGTGTTGACGTTCTGGCGAACAGCGGTGTTGCAGCTC 

CAACTACAGCTGTTGATACCGGTACTCTGCAACTGAGCGGTACTGGTGCAACTACTGAGC 

TGAAAGGTACTGCAACTCAGAACCCACTGGCACTATTGGACAAAGCTATCGCTTCTGTTG 

ATAAATTCCGTTCTTCTCTGGGTGCGGTACAGAATCGTCTGAGCTCTGCTGTAACCAACC 

TGAATAACACCACCACTAACCTGTCTGAAGCGCAGTCCCGTATTCAGGATGCCGACTATG 

CGACCGAAGTGTCAAATATGTCTAAAGCGCAGATCGTTCAGCAGGCCGGTAAC 
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AACAAATCTCAGTCTTCTCTTAGCTCTGCTATTGAGCGTCTGTCTTCT 

GGTCTGCGTATTAACAGCGCAAAAGACGATGCAGCAGGTCAGGCGATTGCTAACCGTTTT 

ACGGCAAATATTAAAGGTCTGACCCAGGCTTCCCGTAACGCAAATGATGGTATTTCTGTT 

GCGCAGACCACTGAAGGTGCGCTGAATGAAATTAACAACAACCTGCAGCGTATTCGTGAA 

CTTTCTGTTCAGGCAACTAACGGTACTAACTCTGACAGCGATCTTTCTTCTATCCAGGCT 

GAAATTACTCAACGTCTGGAAGAAATTGACCGTGTATCTGAGCAAACTCAGTTTAACGGC 

GTGAAAGTCCTTGCTGAAAATAATGAAATGAAAATTCAGGTTGGTGCTAATGATGGTGAA 

ACCATCACTATCAATCTGGCAAAAATTGATGCGAAAACTCTCGGCCTGGACGGTTTTAAT 

ATCGATGGCGCGCAGAAAGCAACAGGCAGTGACCTGATTTCTAAATTTAAAGCGACAGGT 

ACTGATAATTATGATGTTGGCGGTAAAACTTATACCGTGAATGTGGAGAGCGGCGCGGTT 

AAGAATGATGCTAATAAAGATGTTTTTGTAAGCGCAGCTGATGGATCGCTGACGACCAGT

AGTGATACTAAAGTATCCGGTGAAAGTATTGATGCAACAGAACTAGCGAAACTTGCAATA 

AAATTAGCTGACAAAGGCTCCATTGAATACAAGGGCATTACATTTACTAACAACACTGGC 

GCAGAGCTTGATGCTAATGGTAAAGGTGTTTTGACCGCAAATATTGATGGTCAAGATGTT 

CAATTTACTATTGACAGTAATGCACCCACGGGTGCCGGCGCAACAATAACTACAGACACA 

GCTGTTTACAAAAACAGTGCGGGCCAGTTCACCACTACAAAAGTGGAAAATAAAGCCGCA 

ACACTCTCTGATCTGGATCTTAATGCAGCCAAGAAAACAGGTAGCACTTTAGTTGTAAAT 

GGCGCCACCTACAATGTCAGCGCAGATGGTAAAACGGTAACTGATACTACTCCTGGTGCC 

CCTAAAGTGATGTATCTGAGCAAATCAGAAGGTGGTAGCCCGATTCTGGTAAACGAAGAT 

GCAGCAAAATCGTTGCAATCTACCACCAACCCGCTCGAAACTATCGACAAGGCATTGGCT 

AAAGTTGACAATCTGCGTTCTGACCTCGGTGCAGTACAAAACCGTTTCGACTCTGCCATC 

ACCAACCTTGGCAACACCGTAAACAACCTGTCTTCTGCCCGTAGCCGTATCGAAGATGCT 

GACTACGCGACCGAAGTGTCTAACATGTCTCGTGCGCAGATCCTGCAACAAGCGGGTACC 

TCTGTTCTGGCGCAG 
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ATGGCACAAGTCATTAATACCAACAGCCTCTCGCTGATCACT 

CAAAATAATATCAACAAGAACCAGTCTGCGCTGTCGAGTTCTATCGAGCGTCTGTCTTCT 

GGCTTGCGTATTAACAGCGCGAAGGATGACGCCGCAGGTCAGGCGATTGCTAACCGTTTC 

ACCTCTAACATTAAAGGCCTGACTCAGGCTGCACGTAACGCCAACGACGGTATTTCTGTT 

GCACAGACCACCGAAGGCGCGCTGTCCGAAATCAACAACAACTTACAGCGTATCCGTGAA 

CTGACGGTTCAGGCTTCTACCGGGACTAACTCTGATTCGGATCTGGACTCCATTCAGGAC 

GAAATCAAATCCCGTCTGGACGAAATTGACCGCGTATCCGGCCAGACCCAGTTCAACGGC 

GTGAACGTGCTGGCGAAAGACGGTTCAATGAAAATTCAGGTTGGTGCGAATGACGGCCAG 

ACTATCACTATTGATCTGAAGAAAATTGACTCTGATACTCTGGGTTTGAGTGGATTTAAT 

GTGAATGGCAAAGGGGCTGTGGCTAACGCAAAAGCGACCGAAGCAGATTTAACGGGGGCT 

GGTTTCTCTCAAGGAGCGGTGGATACAAACGGAAATAGTACTTGGACAAAATCAACCACC 

ACCAATTACTCAGCTGCAACT^CTGCTGACTTGTTATCGACCATTAAGGATGGCTCTACT 

GTTACATATGCAGGGACAGACACCGGATTAGGGGTCGCAGCAGCAGGAAATTATACTTAT 

GATGCGAACAGTAAATCTTATTCCTTCAATGCCAATGGTCTGACGGGCGCAAATACCGCA 

ACTGCACTCAAAGGTTACTTGGGGACAGGTGCTAACACCGCTAAAATTTCTATCGGTGGT 

ACAGAGCAGGAAGTGAATATTGCCAAAGATGGCACTATTACAGATACGAATGGTGATGCG 

CTCTATCTGGATATTACCGGCAACCTGACTAAGAACTATGCGGGTTCACCACCTGCAGCA 

ACGCTGGATAACGTATTAGCTTCCGCAACTGTAAATGCCACTATCAAGTTTGATAGCGGT 

ATGACGGTTGATTACACTGCAGGTACTGGCGCGAATATTACAGGTGCATCCATTTCTGCA 

GATGACATGGCCGCAAAACTGAGCGGAAAGGCGTACACTGTTGCCAATGGTGCTGAGTCT 

TATGACGTTGCTGCAGTTACGGGGGCTGTAACAACTACAGCAGGTAATTCACCTGTGTAT 

GCCGATGCAGACGGTAAATTAACGACGAGTGCCAGTAATACGGTTACTCAGACTTATCAC 

GAGTTTGCTAATGGTAACATTTATGATGACAAAGGCTCGTCACTGTATAAAGCTGCAGAT 

GGCTCTCTGACTTCTGAAGCTAAAGGGAAATCTGAAGCAACCGCCGATCCCCTGAAAGCT 

CTGGACGAAGCCATCAGCTCCATCGACAAATTCCGCTCCTCCCTCGGTGCCGTTCAAAAC 

CGTCTGGATTCTGCGGTGACCAACCTGAACAACACCACTACCAACCTGTCTGAAGCGCAG 

TCCCGTATTCAGGACGCCGACTATGCGACCGAAGTGTCCAATATGTCGAAAGCGCAGATC 

ATCCAGCAGGCCGGTAACTCCGTGTTGGCAAAAGCTAACCAGGTACCGCAGCAGGTTCTG 

TCTCTGCTGCAGGGTTAA 
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GCGCTGTCGACTTCTATCGAGCGCCTCTCTTCTGGTTTGCGCATTAACAGCGCTA 

AAGATGACGCTGCGGGCCAGGCGATTGCTAACCGCTTCACTTCTAACATCAAAGGTCTGA 

CTCAGGCCGCACGTAACGCCAACGACGGTATCTCTCTGGCGCAGACCACTGAAGGCGCAC 

TGTCTGAAATCAACAACAACTTGCAGCGTGTTCGTGAACTGACCGTTCAGGCCACTACCG 

GTACTAACTCTGATTCTGACCTGTCTTCAATCCAGGACGAAATCAAATCCCGCTTGGCTG

AAATCGATCGTGTCTCTGGTCAGACCCAGTTCAACGGCGTGAACGTGCTGGCTAAAAACG 

GTTCTCTGAATATTCAGGTTGGCGCGAATGATGGGCAGACCATCTCTATCGATTTGCAGA 

AAATAGACTCTTCTGCCCTTGGTTTAAGTGGTTTTAGTGTTGCCGGTGGGGCGCTAAAAT 

TAAGCGATACAGTGACGCAGGTCGGCGATGGTTCAGCCGCGCCAGTTAAAGTGGATCTGG 

ATGCAGCAGCAACAGATATTGGTACTGCTTTGGGGCAAAAGGTTAATGCAAGTTCTTTAA 

CGTTGCACAATATCTTAGACAAAGATGGTGCGGCAACTGAGAACTATGTTGTTAGCTATG

GTAGTGATAATTACGCTGCATCTGTTGCAGATGACGGGACTGTAACTCTTAATAAAACGG 

ATATTACTTATTCAGGCGGTGATATTACCGGCGCTACCAAAGATGATACGTTGATTAAAG 

TTGCTGCTAATTCTGACGGAGAGGCCGTTGGTTTCGCTACCGTTCAGGGTAAGAATTATG 

AAATTACAGATGGTGTAAAAAACCAGTCCACTGCTGCACCAACCGATATTGCTCAGACCA 

TTGATCTGGATACGGCTGATGAATTTACTGGGGCTTCCACTGCTGATCCACTGGCACTTT 

TAGACAAAGCTATTGCACAGGTTGATACTTTCCGCTCCTCCCTCGGTGCCGTTCAAAACC 

GTCTGGATTCCGCAGTCACCAACCTGAACAACACTACTACCAACCTGTCTGAAGCGCAGT 

CCCGTATTCAGGACGCCGACTATGCGACCGAAGTGTCCAATATGTCGAAAGCGCAGATCA

TCCAGCAGGCC
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ATGGCACAAGTCATTAATACCAACAGCCTCTCGCTGATCACT 

CAAAATAATATCAACAAGAACCAGTCTGCGCTGTCGAGTTCTATCGAGCGTCTGTCTTCT 

GGCTTGCGTATTAACAGCGCGAAGGATGACGCAGCGGGTCAGGCGATTGCTAACCGTTTT 

ACTTCTAATATTAAAGGCCTGACTCAGGCTGCACGTAACGCCAATGACGGTATTTCTCTG 

GCGCAGACCACTGAAGGCGCACTGTCTGAAATCAACAACAACTTGCAGCGTGTGCGTGAA 

CTGACCGTACAGGCGACAACCGGAACGAACTCCGAATCTGACCTGTCCTCTATCCAGGAC 

GAAATCAAATCCCGTCTGGAAGAGATTGACCGCGTATCCGGCCAGACTCAGTTCAACGGC 

GTGAATGTGCTGGCAAAAGACGGCACCATGAAAATTCAGGTAGGCGCGAACGATGGTCAG 

ACTATCTCTATCGATCTGAAAAAAATCGACTCTTCAACCCTGGGCCTGACCGGTTTTGAT 

GTTTCGACGAAAGCGAATATTTCTACGACAGCAGTAACGGGGGCGGCAACGACCACTTAT 

GCTGATAGCGCCGTTGCAATTGATATCGGAACGGATATTAGCGGTATTGCTGCTGATGCT 

GCGTTAGGAACGATCAATTTCGATAATACAACAGGCAAGTACTACGCACAGATTACCAGT 

GCGGCCAATCCGGGCCTTGATGGTGCTTATGAAATCCATGTTAATGACGCGGATGGTTCC 

TTCACTGTAGCAGCGAGTGATAAACAAGCGGGTGCTGCTCCGGGTACTGCTCTGACAAGC 

GGTAAAGTTCAGACTGCAACCACCACGCCAGGTACGGCTGTTGATGTCACTGCGGCTAAA 

ACTGCTCTGGCTGCAGCAGGTGCTGACACGAGTGGCCTGAAACTGGTTCAACTGTCCAAC 

ACGGATTCCGCAGGTAAAGTGACCAACGTGGGTTACGGCCTGCAGAATGACAGCGGCACT 

ATCTTTGCAACCGACTACGATGGCACCACTGTGACCACGCCGGGCGCAGAGACTGTGACT 

TACAAAGATGCTTCCGGTAACAGCACCACTGCGGCTGTCACACTGGGTGGCTCTGATGGC 

AAAACCAATCTGGTTACCGCCGCTGACGGCAAAACGTACGGTGCGACTGCACTGAATGGT 

GCTGATCTGTCCGATCCTAATAACACCGTTAAATCTGTTGCAGACAACGCTAAACCGTTG 

GCTGCCCTGGATGATGCAATTGCGATGGTCGACAAATTCCGCTCCTCCCTCGGTGCGGTG 

CAAAACCGTCTGGATTCCGCAGTCACCAACCTGAACAACACCACTACCAACCTGTCTGAA 

GCGCAGTCCCGTATTCAGGACGCCGACTATGCGACCGAAGTGTCCAACATGTCGAAAGCG 

CAGATTATCCAGCAGGCAGGTAACTCCGTGCTGTCCAAAGCTAACCAGGTTCCGCAGCAG 

GTTCTGTCTCTGCTGCAGGGTTAA 
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AACAAAAACCAGTCTGCGCTGTCGACTTCTATCGAGCGCCTCTCTTCTGGT 

CTGCGTATTAACAGCGCTAAAGATGACGCCGCGGGCCAGGCGATTGCTAACCGCTTTACT 

TCTAACATCAAAGGTCTGACTCAGGCCGCACGTAACGCCAACGACGGTATTTCTCTGGCG 

CAGACGGCTGAAGGCGCGCTGTCAGAGATTAACAACAACTTGCAGCGTATTCGTGAACTG 

ACCGTTCAGGCCTCTACCGGCACGAACTCTGATTCCGACCTGTCTTCTATTCAGGACGAA 

ATCAAATCCCGTCTTGATGAAATTGACCGTGTATCTGGTCAGACCCAGTTCAACGGTGTG 

AACGTGCTGTCGAAAAACGATTCGATGAAGATTCAGATTGGTGCCAATGATAACCAGACG 

ATCAGCATTGGCTTGCAACAAATCGACAGTACCACTTTGAATCTGAAAGGATTTACCGTG 

TCCGGCATGGCGGATTTCAGCGCGGCGAAACTGACGGCTGCTGATGGTACAGCAATTGCT 

GCTGCGGATGTCAAGGATGCTGGGGGTAAACAAGTCAATTTACTGTCTTACACTGACACC 

GCGTCTAACAGTACTAAATATGCGGTCGTTGATTCTGCAACCGGTAAATACATGGAAGCC 

ACTGTAGCCATTACCGGTACGGCGGCGGCGGTAACTGTTGGTGCAGCGGAAGTGGCGGGA 

GCCGCTACAGCCGATCCGTTAAAAGCACTGGATGCCGCAATCGCTAAAGTCGACAAATTC 

CGCTCCTCCCTCGGTGCCGTTCAAAACCGTCTGGATTCTGCGGTCACCAACCTGAACAAC 

ACCACCACCAACCTGTCTGAAGCGCAGTCCCGTATTCAGGACGCCGACTATGCGACCGAA 

GTGTCCAACATGTCGAAAGCGCAGATTATCCAGCAGGCCGGTAACTCCGTGCTGGCAAA 
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ATGGCACAAGTCATTAATACCAACAGCCTCTCGCTGATCACTC 

AAAATAATATCAACAAGAACCAGTCTGCGCTGTCGAGTTCTATCGAGCGTCTGTCTTCTG 

GCTTGCGTATTAACAGCGCGAAGGATGACGCAGCGGGTCAGGCGATTGCTAACCGTTTTA 

CCTCTAACATTAAAGGTCTGACTCAGGCTGCACGTAACGCCAACGACGGTATTTCTGTTG 

CACAGACCACTGAAGGCGCGCTGTCCGAAATCAACAACAACTTACAGCGTATCCGTGAAC 

TGACGGTTCAGGCTTCTACCGGGACTAACTCCGATTCGGATCTGGACTCCATTCAGGACG 

AAATCAAATCCCGTCTGGACGAAATTGACCGCGTATCCGGTCAAACCCAGTTCAACGGTG 

TGAACGTACTGGCGAAAGACGGTTCGATGAAAATTCAGGTTGGTGCGAATGACGGCCAGA 

CTATCACGATTGATCTGAAGAAAATTGACTCAGATACGCTGGGGCTGAATGGTTTCAACG 

TTAATGGCAAAGGCACTATTGCGAACAAAGCTGCTACAGTCAGCGATCTGACCGCTGCTG 

GTGCAACGGGAACAGGTCCTTATGCTGTGACCACAAACAATACAGCACTCAGCGCTAGCG 

ATGCACTGTCTCGCCTGAAAACCGGAGATACAGTTACTACTACTGGCTCGAGTGCTGCGA 

TCTATACTTATGATGCGGCTAAAGGGAACTTCACCACTCAAGCAACAGTTGCAGATGGCG 

ATGTTGTTAACTTTGCGAATACTCTGAAACCAGCGGCTGGCACTACTGCATCAGGTGTTT 

ATACTCGTAGTACTGGTGATGTGAAGTTTGATGTAGATGCTAATGGCGATGTGACCATCG 

GTGGTAAAGCCGCGTACCTGGACGCCACTGGTAACCTATCTACAAACAACCCCGGCATTG 

CATCTTCAGCGAAATTGTCCGATCTGTTTGCTAGCGGTAGTACCTTAGCGACAACTGGTT 

CTATCCAGCTGTCTGGCACAACTTATAACTTTGGTGCAGCGGCAACTTCTGGCGTAACCT 

ACACCAAAACTGTAAGCGCTGATACTGTACTGAGCACAGTGCAGAGTGCTGCAACGGCTA 

ACACAGCAGTTACTGGTGCGACAATTAAGTATAATACAGGTATTCAGTCTGCAACGGCGT 

CCTTCGGTGGTGTGAATACTAATGGTGCTGGTAATTCGAATGACACCTATACTGATGCAG 

ACAAAGAGCTCACCACAACCGCATCTTACACTATCAACTACAACGTCGATAAGGATACCG 

GTACAGTAACTGTAGCTTCAAATGGCGCAGGTGCAACTGGTAAATTTGCAGCTACTGTTG 

GGGCACAGGCTTATGTTAACTCTACAGGCAAACTGACCACTGAAACCACCAGTGCAGGCA 

CTGCAACCAAAGATCCTCTGGCTGCCCTGGATGAAGCTATCAGCTCCATCGACAAATTCC 

GTTCATCCCTGGGTGCTATCCAGAACCGTCTGGATTCCGCGGTTACCAACCTGAACAACA 

CCACTACCAACCTGTCCGAAGCGCAGTCCCGTATTCAGGACGCCGACTATGCGACCGAAG 

TGTCCAACATGTCGAAAGCGCAGATTATCCAGCAGGCCGGTAACTCCGTGCTGGCAAAAG 

CCAACCAGGTACCGCAGCAGGTTCTGTCTCTGCTGCAGGGTTAA 
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ATGGCACAAGTCATTAATACCAACAGCCTCTCGCTGATCAC 

TCAAAATAATATCAACAAGAACCAGTCTGCGCTGTCGAGTTCTATCGAGCGTCTGTCTTC 

TGGCTTGCGTATTAACAGCGCGAAGGATGACGCCGCAGGTCAGGCGATTGCTAACCGTTT 

TACTTCTAATATTAAAGGCCTGACTCAGGCTGCACGTAACGCCAATGACGGTATTTCTGT 

TGCACAGACCACTGAAGGCGCGCTGTCCGAAATCAACAACAACTTACAGCGTGTGCGTGA 

ACTGACCGTTCAGGCGACCACCGGTACCAACTCCCAGTCTGATCTGGACTCTATCCAGGA 

CGAAATCAAATCCCGTCTGGACGAAATTGACCGCGTATCCGGTCAGACTCAGTTCAACGG 

CGTGAACGTACTGGCAAAAGACGGTTCCATGAAAATTCAGGTTGGCGCGAATGATGGCCA 

GACCATCACTATCGACCTGAAGAAGATTGACTCTTCTACGTTGAAACTGACTGGTTTTAA 

CGTGAATGGTTCTGGTTCTGTGGCGAATACTGCGGCGACTAAAGACGAACTGGCTGCTGC 

TGCTGCGGCGGCGGGTACAACTCCTGCTGTCGGTACTGACGGCGTGACCAAATATACCGT 

AGACGCAGGGCTTAACAAAGCCACAGCAGCAAACGTGTTTGCAAACCTTGCAGATGGTGC 

TGTTGTTGATGCTAGCATTTCCAACGGTTTTGGTGCAGCAGCAGCCACAGACTACACCTA 

CAATAAAGCTACAAATGATTTCACTTTCAATGCCAGCATTGCTGCTGGTGCTGCGGCCGG 

TGATAGTAACAGCGCAGCTCTGCAATCCTTCCTGACTCCAAAAGCAGGTGATACAGCTAA

CCTGAGCGTCAAAATCGGTACGACATCTGTTAATGTTGTTCTGGCGAGCGATGGCAAAAT 

TACAGCGAAAGATGGCTCAGCTCTGTATATCGACTCAACGGGTAACCTGACTCAGAACAG 

CGCAGGCACTGTAACAGCAGCAACCCTGGATGGACTGACCAAAAACCATGATGCGACAGG 

AGCTGTTGGTGTTGATATCACGACCGCAGATGGCGCAACTATCTCTCTGGCAGGCTCTGC 

TAACGCGGCAACAGGTACTCAATCAGGTGCAATTACACTGAAAAATGTTCGTATCAGTGC 

TGATGCTCTGCAGTCTGCTGCGAAAGGTACTGTTATCAATGTTGATAATGGTGCTGATGA 

TATTTCTGTTAGTAAAACCGGGTGTCGTTACTACCGGAGGTGCGCCTACTTATACTGATG 

CTGATGGTAAATTAACGACAACCAACACCGTTGATTATTTCCTGCAAACTGATGGCAGCG 

TAACCAATGGTTCTGGTAAAGGGGTTTACACCGATGCAGCTGGTAAATTCACTACCGACG 

CTGCAACCAAAGCCGCAACCACCACCGATCCGCTGAAAGCCCTTGATGACGCAATCAGCC 

AGATCGATAAGTTCCGTTCATCCCTGGGTGCTATCCAGAACCGTCTGGATTCCGCGGTTA 

CCAACCTGAACAACACCACTACCAACCTGTCCGAAGCGCAGTCCCGTATTCAGGACGCCG 

ACTATGCGACCGAAGTGTCCAATATGTCGAAAGCGCAGATCATCCAGCAGGCCGGTAACT 

CCGTGTTGGCAAAAGCTAACCAGGTACCGCAGCAGGTTCTGTCTCTGCTGCAGGGTTAA 
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AACTUU^TCTCAGTCTTCTCTTAGCTCTGCTATTGAGCGTCTGTCTTCTGGT 

CTGCGTATTAACAGCGCAAAAGACGATGCAGCAGGTCAGGCGATTGCTAACCGTTTTACG

GCAAATATTAAAGGTCTGACCCAGGCTTCCCGTAACGCGAATGATGGTATTTCTGTTGCG 

CAGACCACTGAAGGTGCGCTGAATGAAATTAACAACAACCTGCAGCGTATTCGTGAACTT 

TCTGTTCAGGCAACTAACGGTACTAACTCTGACAGCGATCTTTCTTCTATCCAGGCTGAA 

ATTACTCAACGTCTGGAAGAAATTGACCGTGTATCTGAGCAAACTCAGTTTAACGGCGTG 

AAAGTCCTTGCTGAAAATAATGAAATGAAAATTCAGGTTGGTGCTAATGATGGTGAAACC 

ATCACTATCAATCTGGCAAAAATTGATGCGAAAACTCTCGGCCTGGACGGTTTTAATATC 

GATGGCGCGCAGAAAGCAACCGGCAGTGACCTGATTTCTAAATTTAAAGCGACAGGTACT 

GATAATTATCAAATTAACGGTACTGATAACTATACTGTTAATGTAGATAGTGGAGTAGTA 

CAGGATAAAGATGGCAAACAAGTTTATGTGAGTGCTGCGGATGGTTCACTTACGACCAGC

AGTGATACTCAATTCAAGATTGATGCAACTAAGCTTGCAGTGGCTGCTAAAGATTTAGCT 

CAAGGTAATAAGATTGTCTACGAAGGTATCGAATTTACAAATACCGGCACTGGCGCTATA 

CCTGCCACAGGTAATGGTGAATTAACCGCCAATGTTGATGGTAAGGCTGTTGAATTCACT

ATTTCGGGGAGTGCTGATACATCAGGTACTAGTGCAACCGTTGCCCCTACGACAGCCCTA 

TACAAAAATAGTGCAGGGCAATTGACTGCAACAAAAGTTGAAAATAAAGCAGCGACACTA

TCTGATCTTGATCTGAACGCTGCCAAGAAAACAGGAAGCACGTTAGTTGTTAACGGTGCA 

ACTTACGATGTTAGTGCAGATGGTAAAACGATAACGGAGACTGCTTCTGGTAACAATAAA 

GTCATGTATCTGAGCAAATCAGAAGGTGGTAGCCCGATTCTGGTAAACGAAGATGCAGCA 

AAATCGTTGCAATCTACCACCAACCCGCTCGAAACTATCGACAAAGCATTGGCTAAAGTT 

GACAATCTGCGTTCTGACCTCGGTGCAGTACAAAACCGTTTCGACTCTGCCATCACCAAC 

CTTGGCAACACCGTAAACAACCTGTCTTCTGCCCGTAGCCGTATCGAAGATGCTGACTAC 

GCGACCGAAGTGTCTAACATGTCTCGTGCGCAGATCCTGCAACAAGCGGGTACCTCTGTT

CTGGCACAG
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ATGGCACAAGTCATTAATACCAACAGCCTCTCGCTGATCAC 

TCAAAATAATATCAACAAGAACCAGTCTGCGCTGTCGAGTTCTATCGAGCGTCTGTCTTC 

TGGCTTGCGTATTAACAGCGCGAAGGATGACGCAGCGGGTCAGGCGATTGCTAACCGTTT 

CACCTCTAACATTAAAGGCCTGACTCAGGCGGCCCGTAACGCCAACGACGGTATCTCCGT 

TGCGCAGACCACCGAAGGCGCGCTGTCCGAAATCAACAACAACTTACAGCGTGTGCGTGA 

ACTGACGGTACAGGCCACTACCGGTACTAACTCTGAGTCTGATCTGTCTTCTATCCAGGA 

CGAAATTTIAATCCCGTCTGGATGAAATTGACCGCGTATCTGGTCAGACCCAGTTCAACGG 

CGTGAACGTGCTGGCAAAAAATGGCTCCATGAAAATCCAGGTTGGCGCAAATGATAACCA 

GACTATCACTATCGATCTGAAGCAGATTGATGCTAAAACTCTTGGCCTTGATGGTTTTAG 

CGTTAAAAATAACGATACAGTTACCACTAGTGCTCCAGTAACTGCTTTTGGTGCTACCAC 

CACAAACAATATTAAACTTACTGGAATTACCCTTTCTACGGAAGCAGCCACTGATACTGG 

CGGAACTAACCCAGCTTCAATTGAGGGTGTTTATACTGATAATGGTAATGATTACTATGC 

GAAAATCACCGGTGGTGATAACGATGGGAAGTATTACGCAGTAACAGTTGCTAATGATGG 

TACAGTGACAATGGCGACTGGAGCAACGGCAAATGCAACTGTAACTGATGGAAATACTAC 

TAAAGCTACAACTATCACTTCAGGCGGTACACCTGTTCAGATTGATAATACTGCAGGTTC 

CGCAACTGCCAACCTTGGTGCTGTTAGCTTAGTAAAACTGCAGGATTCCAAGGGTAATGA 

TACCGATACATATGCGCTTAAAGATACAAATGGCAATCTTTACGCTGCGGATGTGAATGA 

AACTACTGGTGCTGTTTCTGTTAAAACTATTACCTATACTGACTCTTCCGGTGCCGCCAG 

TTCTCCAACCGCGGTCAAACTGGGCGGAGATGATGGCAAAACAGAAGTGGTCGATATTGA 

TGGTAAAACATACGATTCTGCCGATTTAAATGGCGGTAATCTGCAAACAGGTTTGACTGC 

TGGTGGTGAGGCTCTGACTGCTGTTGCAAATGGTAAAACCACGGATCCGCTGAAAGCGCT 

GGACGATGCTATCGCATCTGTAGACAAATTCCGTTCTTCCCTCGGTGCGGTGCAAAACCG 

TCTGGATTCCGCGGTTACCAACCTGAACAACACCACTACCAACCTGTCTGAAGCGCAGTC 

CCGTATTCAGGACGCCGACTATGCGACCGAAGTGTCCAATATGTCGAAAGCGCAGATCAT 

CCAGCAGGCCGGTAACTCCGTGTTGGCAAAAGCTAACCAGGTACCGCAGCAGGTTCTGTC 

TCTGCTGCAGGGTTAA 
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ATGGCACAAGTCATTAATACCAACAGCCTCTCGCTGATCACT 

CAAAATAATATCAACAAGAACCAGTCTGCGCTGTCGAGTTCTATCGAGCGTCTGTCTTCT 

GGCTTGCGTATTAACAGCGCGAAGGATGACGCCGCAGGTCAGGCGATTGCTAACCGTTTT 

ACTTCTAACATTAAAGGCCTGACTCAGGCTGCACGTAACGCCAACGACGGTATTTCTGTT 

GCGCAGACCACCGAAGGCGCGCTGTCTGAAATCAACAACAACTTACAGCGTATTCGTGAA 

CTGACGGTTCAGGCTTCTACCGGGACTAACTCTGATTCGGATCTGGACTCCATTCAGGAC 

GAAATCAAATCCCGTCTGGACGAAATTGACCGCGTATCCGGTCAAACCCAGTTCAACGGT 

GTGAACGTACTGGCGAAAGACGGTTCGATGAAAATTCAGGTTGGTGCGAATGACGGCCAG 

ACTATCACTATTGATCTGAAGAAAATTGACTCTGATACGCTGGGGCTGAATGGTTTTAAC 

GTTAACGGCAAAGGTACTATTGCGAACAAAGCGGCAACCATTAGTGATCTGGCGGCGACG 

GGGGCGAATGTTACTAACTCAAGCAATATTGTTGTCACGACAAAGTTCAATGCCTTGGAT 

GCAGCGACTGCATTTAGCAAACTCAAAGATGGTGATTCTGTTGCCGTTGCTGCTCAGAAA 

TATACTTATAACGCATCGACCAATGATTTTACGACAGAAAATACAGTAGCGACAGGCACT 

GCAACGACAGATCTTGGCGCTACTCTGAAGGCTGCTGCTGGGCAGAGTCAATCAGGTACA 

TATACCTTTGCAAATGGTAAAGTTAACTTTGATGTTGATGCAAGCGGTAATATCACTATT 

GGCGGCGAAAAGGCTTTCTTGGTTGGTGGAGCGCTGACTACTAACGATCCCACCGGCTCC 

ACTCCAGCAACGATGTCTTCCCTGTTTAAGGCCGCGGATGACAAAGATGCCGCTCAATCC 

TCGATTGATTTTGGCGGGAAAAAATACGAATTTGCTGGTGGCAATTCTACTAATGGTGGC 

GGCGTTAAATTCAAAGACACGGTGTCTTCTGACGCGCTTTTGGCTCAGGTTAAAGCGGAT 

AGTACTGCTAATAATGTAAAAATCACCTTTAACAATGGTCCTCTGTCATTCACTGCATCG 

TTCCAAAATGGTGTATCTGGCTCCGCGGCATCGAATGCAGCCTACATTGATAGCGAAGGC 

GAACTGACAACTACTGAATCCTACAACACAAATTATTCCGTAGACAAAGACACGGGGGCT 

GTAAGTGTTACAGGGGGGAGCGGTACGGGTAAATACGCCGCAAACGTGGGTGCTCAGGCT 

TATGTAGGTGCAGATGGTAAATTAACCACGAATACTACTAGTACCGGCTCTGCAACCAAA 

GATCCACTAAATGCGCTGGATGAGGCAATTGCATCCATCGACAAATTCCGTTCTTCCCTG 

GGGGCTATCCAGAACCGTCTGGATTCCGCAGTCACCAACCTGAACAACACCACTACCAAC 

CTGTCTGAAGCGCAGTCCCGTATTCAGGACGCCGACTATGCGACCGAAGTGTCCAACATG 

TCGAAAGCGCAGATCATCCAGCAGGCCGGTAACTCCGTGTTGGCAAAAGCTAACCAGGTA 

CCGCAGCAGGTTCTGTCTCTGCTGCAGGGTTAA 
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AACAAGAACCAGTCTGCGCTGTCGAGTTCTATCGAGCGTCTGTC 

TTCTGGCTTGCGTATTAACAGCGCGAAGGATGACGCCGCGGGTCAGGCGATTGCTAACCG 
TTTTACTTCTAACATTAAAGGCCTGACTCAGGCTGCACGTAACGCCAACGACGGTATTTC 
TGTTGCGCAGACCACCGAAGGCGCGCTGTCCGAAATTAACAACAACTTACAGCGTGTGCG 
TGAGCTGACTGTTCAGGCGACCACCGGTACTAACTCTGAGTCTGACCTGTCTTCTATCCA 
GGACGAAATCAAATCTCGCCTGGAAGAGATTGATCGTGTTTCAAGTCAGACTCAATTTAA 
CGGCGTGAATGTTTTGGCTAAAGATGGGAAAATGAACATTCAGGTTGGGGCAAGTGATGG 
ACAGACTATCACTATTGATCTGAAAAAGATCGATTCATCTACACTAAACCTCTCCAGTTT 
TGATGCTACAAACTTGGGCACCAGTGTTAAAGATGGGGCCACCATCAATAAGCAAGTGGC 
AGTAGATGCTGGCGACTTTAAAGATAAAGCTTCAGGATCGTTAGGTACCCTAAAATTAGT 
TGAGAAAGACGGTAAGTACTATGTAAATGACACTAAAAGTAGTAAGTACTACGATGCCGA 
AGTAGATACTAGTAAGGGTGAAATTAACTTCAACTCTACAAATGAAAGTGGAACTACTCC
TACTGCAGCGACGGAAGTAACTACTGTTGGCCGCGATGTAAAATTGGATGCTTCTGCACT 
TAAAGCCAACCAATCGCTTGTCGTGTATAAAGATAAAAGCGGCAATGATGCTTATATCAT 
TCAGACCAAAGATGTAACAACTAATCAATCAACTTTCAATGCCGCTAATATCAGTGATGC 
TGGTGTTTTATCTATTGGTGCATCTACAACCGCGCCAAGCAATTTAACAGCTGACCCGCT 
TAAGGCTCTTGATGATGCAATTGCATCTGTTGATAAATTCCGCTCTTCTCTCGGTGCCGT 
TCAGAACCGTCTGGATTCTGCCATTGCCAACCTGAACAACACCACTACCAACCTGTCTGA 
AGCGCAGTCCCGTATTCAGGACGCTGACTATGCGACCGAAGTGTCCAACATGTCGAAAGC 
GCAGATTATCCAGCAGGCCGGTAACTCCGTGCTGGCAAAA 
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ATGGCACAAGTCATTAATACCAACAGCCTCTCGCTGATCACTCAAAA 

TAATATCAACAAGAACCAGTCTGCGCTGTCGAGTTCTATCGAGCGTCTGTCTTCTGGCTT 

GCGTATTAACAGCGCGAAGGATGACGCAGCGGGTCAGGCGATTGCTAACCGTTTCACCTC 

TAACATTAAAGGCCTGACTCAGGCTGCACGTAACGCTAACGATGGTATCTCTCTGGCGCA 

GACCACTGAAGGCGCACTGTCTGAGATTAACAACAACTTACAACGTGTGCGTGAGTTGAC 

TGTACAGGCGACCACCGGTACTAACTCTGATTCTGACCTGGCTTCTATTCAGGACGAAAT 

CAAATCCCGTTTGTCTGAAATTGACCGCGTATCCGGGCAGACCCAGTTCAACGGCGTGAA 

CGTATTGTCTAAAGATGGCTCCCTGAAAATTCAGGTTGGCGCTVAATGATGGTCAGACTAT 

CTCTATCGACCTGAAGAAAATTGACTCTGATACTCTGGGTTTGAATGGTTTCAACGTTAA 

TGGTTCTGGTACCATTGCAAACAAAGCGGCCACAATCAGTGACTTGACTGCTCAGAAAGC 

CGTTGACAACGGTAATGGTACTTATAAAGTTACAACTAGCAACGCTGCACTTACTGCATC 

TCAGGCATTAAGTAAGCTGAGTGATGGCGATACTGTAGATATTGCAACCTATGCTGGTGG 

TACAAGTTCAACAGTTAGTTATAAATACGACGCAGATGCAGGTAACTTCAGTTATAACAA 

TACTGCAAACAAAACAAGTGCTGCGGCTGGAACTCTGGCAGATACTCTTCTCCCGGCAGC 

TGGCCAGACTAAAACCGGTACTTACAAGGCTGCTACTGGTGATGTTAACTTTAATGTTGA 

CGCAACTGGTAATCTGACAATTGGCGGACAGCAAGCCTACCTGACTACTGATGGTAACCT 

TACAACAAACAACTCCGGTGGTGCGGCTACTGCAACTCTTAAAGAGCTGTTTACTCTTGC 

TGGCGATGGTAAATCTCTGGGGAACGGCGGTACTGCTACCGTTACTCTGGATAATACTAC 

GTATAATTTCAAAGCTGCTGCGAACGTTACTGATGGTGCTGGTGTCATCGCTGCTGCTGG 

TGTAACTTATACAGCCACTGTTTCTAAAGATGTCATTCTGGCACAACTGCAATCTGCAAG

TCAGGCAGCAGCAACCGCTACCGACGGTGATACTGTCGCAACGATCAACTATAAATCTGG 

TGTCATGATCGGTTCCGCTACCTTTACCAATGGTAAAGGTACTGCCGATGGTATGACTTC 

TGGTACAACTCCAGTCGTAGCTACAGGTGCTAAAGCTGTATATGTTGATGGCAACAATGA 

ACTGACTTCCACTGCATCTTACGATACGACTTACTCTGTCAACGCAGATACAGGCGCAGT 

AAAAGTGGTATCAGGTACTGGTACTGGTAAATTTGAAGCTGTTGCTGGTGCGGATGCTTA 

TGTAAGCAAAGATGGCAAATTAACGACAGAAACCACCAGTGCAGGCACTGCAACCAAAGA 

TCCTTTGGCTGCCCTGGATGCTGCTATCAGCTCCATCGACAAATTCCGTTCCTCCCTGGG 

TGCTATCCAGAACCGTCTGGATTCCGCAGTCACCAACCTGAACAACACCACTACTAACCT 

GTCTGAAGCGCAGTCCCGTATTCAGGACGCCGACTATGCGACCGAAGTGTCCAATATGTC 

GAAAGCGCAGATCATCCAGCAGGCCGGTAACTCTGTGTTGGCAAAAGCTAACCAGGTACC 

GCAGCAGGTTCTGTCTCTGCTGCAGGGTTAA 
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ATGGCACAAGTCATTAATACCAACAGCC 

TCTCGCTGATCACTCAAAATAATATCAACAAGAACCAGTCTGCGCTGTCGAGTTCTATCG 

AGCGTCTGTCTTCTGGCTTGCGTATTAACAGCGCGAAGGATGACGCCGCAGGTCAGGCGA 

TTGCTAACCGTTTTACTTCTAACATTAAAGGCCTGACTCAGGCTGCACGTAACGCCAACG 

ACGGTATTTCTGTTGCACAGACCACTGAAGGCGCGCTGTCCGAAATCAACAACAACTTAC 

AGCGTATTCGTGAACTGACGGTTCAGGCTTCTACCGGGACTAACTCTGATTCGGATCTGG 

ACTCCATTCAGGACGAAATCAAATCCCGTCTCGACGAAATTGACCGCGTTTCCGGTCAGA 

CCCAGTTCAACGGCGTGAACGTGCTGGCGAAAGACGGTTCGATGAAGATTCAGGTTGGCG 

CGAATGACGGGCAGACCATCTCTATCGATTTGCAGAAAATTGATTCTTCAACGCTGGGAT 

TGAT^GGTTTCTCGGTATCAGGGAACGCATTAAAAGTTAGCGATGCGATAACTACAGTTC 

CTGGTGCTAATGCTGGCGATGCCCCGGTTACGGTTAAATTTGGTGCGAACGATACCGCTG 

CTGCCGCAATGGCTAAAACATTGGGAATAAGTGATACATCAGGCTTGTCCCTACATAACG 

TACAAAGCGCGGATGGTAAAGCGACAGGAACCTATGTTGTTCAATCTGGTAATGACTTCT 

ATTCGGCTTCCGTTAATGCTGGTGGCGTTGTTACGCTTAATACCACCAATGTTACTTTCA 

CTGATCCTGCGAACGGTGTTACCACAGCAACACAGACAGGTCAGCCTATCAAGGTCACGA 

CGAATAGTGCTGGCGCGGCTGTTGGCTATGTTACTATTCAAGGCAAAGATTACCTTGCTG 

GTGCAGACGGTAAGGATGCAATTGAAAACGGTGGTGACGCTGCAACAAATGAAGACACAA 

AAATCCAACTTACCGATGAACTCGATGTTGATGGTTCTGTAAAAACAGCGGCAACAGCAA 

CATTTTCTGGTACTGCAACCAACGATCCGCTGGCACTTTTAGACAAAGCTATCTCGCAAG 

TTGATACTTTCCGCTCCTCCCTCGGTGCCGTACAAAACCGTCTGGATTCTGCGGTCACCA 

ACCTGAATAACACCACCACCAACCTGTCTGAAGCGCAGTCCCGTATTCAGGACGCCGACT 

ATGCGACCGAAGTGTCCAACATGTCGAAAGCGCAGATCATCCAGCAGGCGGGTAACTCTG 

TGCTGTCTAAAGCTAACCAGGTACCGCAGCAGGTTCTGTCTCTGCTGCAGGGTTAA 
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CTTCTCTTAGCTCTGCTATTGAGCGTCTGTCTTCTGGTCTGCGTATTAACAGCGCAAAAG 

ACGATGCAGCAGGTCAGGCGATTGCTAACCGTTTTACGGCAAATATTAAAGGTCTGACCC 

AGGCTTCCCGTAACGCGAATGATGGTATTTCTGTTGCGCAGACCACTGAAGGTGCGCTGA 

ATGAAATTAACAACAACCTGCAGCGTATTCGTGAACTTTCTGTTCAGGCAACTAACGGTA 

CTAACTCTGACAGCGATCTTTCTTCTATCCAGGCTGAAATTACTCAACGTCTGGAAGAAA 

TTGACCGTGTATCTGAGCAAACTCAGTTTAACGGCGTGAAAGTCCTTGCTGAAAATAATG 

AAATGAAAATTCAGGTTGGTGCTAATGATGGTGAAACCATTGACCTGCCCCCACGATTAG 

ATACAACACTCAGTTAGTAACGTCGGAATCTTCATTCTCAGAATGACCCTTTCTCCAGCC 

CGCTGCAAATTCAGACGGTGTCTGATAATTCAGCGTGGAGTGCGGGCGGCATTCGTTATA 

ATCCTGCCGCCAGTCATTAATAATTTTCCTGGCATGAACGATATCGCTGAACCAGTGCTC 

ATTCAAACATTCATCGCGAAATCGTCCGTTAAAGCTCTCAATAAATCCGTTCTGCGTTGG 

CTTGCCCGGCTGGATTAAGCGCAACTCAACACCATGCTCAAAGGCCCATTGATCCAGTGC 

ACGGCAAGTGAACTCCGGCCCCTGGTCAGTTCTTATCGTCGCCGGATAGCCTCGAAACAG 

TGCAATGCTGTCCAGAATACGCGTGACCTGAACGCCTGAAATCCCAAAGGCAACAGTGAC 

CGTCAGGCATTCCTTTGTGAAATCATCGACGCAGGTAAGACACTTGATCCTGCGACCGGT 

GGAAAGTGCGTCCATGACGAAATCCATCGACCAGGTCAGATTGGGCGCCGCCGGACGGAG 

CAGCGGCAGACGTTCTGTTGCCAGCCCTTTACGACGTCTTCTGCGTTTTACGCCCAGGCC 

ACTGAGGTGATAAAGCCGGTACACGCGCTTATGATTAACATGAAGCCCTTCACGGCGCAG 

CAACTGCCAAATACGACGGTAGCCAAAACGCCTGCGCTCCAGTGCCAGCTCAGTGATGCG 

CCCTGATAAATGCGCATCAGCAGCCGGACGGTGAGCCTCATAGCGGCAGGTCGACAGGGA 

TAAACCTGTAAGCCTGCAGGCACGACGTTGCGACAGACCGGTCGCATCACACATCAACAT 

CACGGCTTCCCGCTTCTGGTCTGTCGTCAGTACTTTCGCCCAAGAGCCACCTGAAGCGCC 

TCTTTATCCAGCATGGCTTCGGCAAGCAGCTTCTTGAGTCTGGTGTTCTCTTCCTCAAGC 

GACTTCAGGCGCTTAACTTCAGGCACCTCCATACCGCCATACTTCTTACGCCAGGTGTAA 

AACGTGGCATCGGAAATGGCATGCTTGCGGCAGAGTTCACGGGCGGGTACCCCAGCTTCG 

GCTTCGCGGAGAATACTGATGATCTGTTCGTCGGAAAAACGCTTCTTCATGGGGATGTCC 

TCATGTGGCTTATGAAGACATTACTAACATCGGGGTGTACTAATCAACGGGGAGCAGGTC 

ACCATCACTATCAATCTGGCAAAAATTGATGCGAAAACTCTCGGCCTGGACGGTTTTAAT 

ATCGATGGCGCGCAGAAAGCAACCGGCAGTGACCTGATTTCTAAATTTAAAGCGACAGGT 

ACTGATAATTATCAAATTAACGGTACTGATAACTATACTGTTAATGTAGATAGTGGAGTA 

GTACAGGATAAAGATGGCAAACAAGTTTATGTGAGTGCTGCGGATGGTTCACTTACGACC 

AGCAGTGATACTCAATTCAAGATTGATGCAACTAAGCTTGCAGTGGCTGCTAAAGATTTA 

GCTCAAGGTAATAAGATTGTCTACGAAGGTATCGAATTTACAAATACCGGCACTGGCGCT 

ATACCTGCCACAGGTAATGGTAAATTAACCGCCAATGTTGATGGTAAGGCTGTTGAATTC 

ACTATTTCGGGGAGTGCTGATACATCAGGTACTAGTGCAACCGTTGCCCCTACGACAGCC 

CTATACAAAAATAGTGCAGGGCAATTGACTGCAACAAAAGTTGAAAATAAAGCAGCGACA 

CTATCTGATCTTGATCTGAACGCTGCCAAGAAAACAGGAAGCACGTTAGTTGTTAACGGT 

GCAACTTACGATGTTAGTGCAGATGGTAAAACGATAACGGAGACTGCTTCTGGTAACAAT 

AAAGTCATGTATCTGAGCAAATCAGAAGGTGGTAGCCCGATTCTGGTAAACGAAGATGCA 

GCAAAATCGTTGCAATCTACCACCAACCCGCTCGAAACTATCGACAAAGCATTGGCTAAA 

GTTGACAATCTGCGTTCTGACCTCGGTGCAGTACAAAACCGTTTCGACTCTGCCATCACC 

AACCTTGGCAACACCGTAAACAACCTGTCTTCTGCCCGTAGCCGTATCGAAGATGCTGAC 

TACGCGACCGAAGTGTCTAACATGTCTCGTGCGCAGATCCTGCAACAAGCGGGTACCTCT 

GTTCTGGCACAGGCTAACC
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AACAAAAACCAGTCTGCGCTGTCGACTTCTATCGAGCGCCTCTCT 

TCTGGTCTGCGCATTAACAGCGCTAAAGATGACGCTGCGGGCCAGGCGATTGCTAACCGC 

TTCACTTCTAACATCAAAGGTCTGACTCAGGCCGCACGTAACGCCAACGACGGTATCTCT 

CTGGCGCAGACCACTGAAGGCGCACTGTCTGAAATCAACAACAACTTGCAGCGTGTTCGT 

GAACTGACCGTTCAGGCCACTACCGGTACTAACTCTGATTCTGACCTGTCTTCAATCCAG 

GACGAAATCAAATCCCGTCTCGATGAAATTGACCGCGTATCCGGTCAGACTCAGTTCAAC 

GGCGTGAACGTACTGGCAAAAGATGGCTCGATGAAAATTCAGGTCGGTGCAAATGATGGT 

CAGACAATCAGCATTGATTTGCAGAAGATTGATTCTTCTACTTTAGGGTTAAATGGTTTT 

TCTGTTTCCAAAAATGCAGTATCTGTTGGTGATGCTATTACTCAATTGCCTGGCGAGACG 

GCAGCCGATGCACCAGTAACCATCAAGTTTGATGATTCAGTAAAAACTGATTTAAAACTG 

ACCGATGCTTCAGGGTTAAGTCTGCATAACCTCAAAGATGAAAATGGTAATTTAACTAAC 

CAGTATGTTGTACAGAATGGCGGAAAATCTTACGCTGCTACAGTCGCTGCCAATGGTAAT 

GTTACGCTGAACAAAGCAAATGTAACCTACAGCGATGTCGCAAACGGTATTGATACCGCA 

ACGCAGTCAGGCCAGTTAGTTCAGGTTGGTGCAGATTCTACCGGTACGCCAAAAGCATTC 

GTGTCTGTCCAAGGTAAAAGCTTTGGCATTGATGACGCCGCCTTGAAGAATAACACTGGT 

GATGCTACCGCTACTCCACCGGGAACATCTGGGACAACAGTTGTCGCAGCGTCAATTCAT 

CTGAGTACGGGCAAAAACTCTGTAGACGCTGATGTAACGGCTTCCACTGAATTCACAGGT 

GCTTCAACCAACGATCCACTGACTCTGCTGGACAAAGCTATCGCATCTGTTGATAAATTC 

CGTTCTTCTTTGGGGGCGGTACAGAACCGTCTGAGCTCCGCTGTAACCAACCTGAACAAC 

ACCACCACCAACCTGTCTGAAGCGCAGTCCCGTATTCAGGACGCCGACTATGCGACCGAA 

GTGTCCAACATGTCGAAAGCGCAGATTATCCAGCAGGCAGGTAACTCCGTGCTGTCCAAA 
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AACAAAAACCAGTCTGCGCTGTCGACTTCTATCGAACGCCTCTCTTCTGG 

CCTGCGTATTAACAGTGCGAAAGATGACGCTGCCGGTCAGGCGATAGCTAACCGTTTCAC 

CTCTAACATTAAAGGCCTGACTCAGGCTGCGCGTAACGCCAACGACGGTATTTCTCTGGC 

GCAGACCACAGAAGGTGCGTTGTCTGAAATCAACAACAACTTGCAACGTGTGCGTGAGTT 

GACCGTTCAGGCGACGACCGGTACTAACTCTGATTCTGACCTGTCATCTATTCAGGACGA 

AATCAAATCCCGTCTGGATGAGATTGACCGTGTTTCCGGTCAGACCCAGTTCAACGGCGT 

GAATGTACTGGCAAAAGACGGTTCGATGAAGATTCAGGTTGGCGCGAATGATGGCCAGAC 

TATTAGCATTGATTTACAGA/^AATTGACTCTTCTACATTAGGGTTGAATGGTTTCTCCGT 

TTCTGCTCAATCACTTAACGTTGGTGATTCAATTACTCT^AATTACAGGAGCCGCTGGGAC 

AAAACCTGTTGGTGTTGATTTCACTGCTGTTGCGAAAGATCTGACTACTGCGACAGGTAA 

AACTGTCGATGTTTCCAGCCTGACGTTACACAACACCCTGGATGCGA/^GGGGCTGCCAC 

CGCACAGTTCGTCGTTCAATCCGGTAGTGATTTCTACTCCGCGTCCATTGACCATGCAAG 

TGGTGAAGTGACGTTGAATAAAGCCGATGTCGAATACAAAGACACCGATAATGGACTAAC 

GACTGCAGCTACTCAGAAAGATCAGCTGATTAAAGTTGCCGCTGACTCTGACGGCGCGGC 

TGCGGGATATGTAACATTCCAGGGTAAAAACTACGCTACAACGGCTCCAGCGGCGCTTAA 

TGATGACACTACGGCAACAGCCACAGCGAACAAAGTTGTTGTTGAATTATCTACAGCAAC 

TCCGACTGCGCAGTTCTCAGGGGCTTCTTCTGCTGATCCACTGGCACTTTTAGACAAAGC 

CATTGCACAGGTTGATACTTTCCGCTCCTCCCTCGGTGCCGTTCAAAACCGTCTGGACTC 

TGCGGTAACCAACCTGAACAACACCACCACCAACCTGTCTGAAGCGCAGTCCCGTATTCA 

GGACGCCGACTATGCGACCGAAGTGTCTAACATGTCGAAAGCGCAGATCATCCAGCAGGC 

GGGTAACTCTGTGCTGTCTAAA 
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ATGGCACAAG TCATTAATAC CAACAGCCTC TCGCTGATCA CTCAAAATAA TATCAACAAG 
AACCAGTCTG CGCTGTCGAG TTCTATCGAG CGTCTGTCTT CTGGCTTGCG TATTAACAGC 
GCGAAGGATG ACGCCGCGGG TCAGGCGATT GCTAACCGTT TTACTTCTAA CATTAAAGGC 
CTGACTCAGG CTGCACGTAA CGCCAACGAC GGTATTTCTG TTGCACAGAC CACTGAAGGC 
GCGCTGTCCG AAATCAACAA CAACTTACAG CGTATCCGTG AGCTGACGGT TCAGGCTTCT 
ACCGGGACTA ACTCTGATTC GGATCTGGAC TCCATTCAGG ACGAAATCAA ATCCCGTCTC 
GACGAAATTG ACCGCGTATC CGGTCAGACC CAGTTCAACG GCGTGAACGT ACTGGCAAAA 
GACGGTTCGA TGAAAATTCA GGTTGGTGCG AATGACGGTG AAACTATCAC TATCGACCTG 
AAGAAAATCG ATTCTGATAC TCTGGGTCTG AATGGTTTTA ACGTAAATGG TAAAGGTACT 
ATTACCAACA AAGCTGCAAC GGTAAGTGAT TTAACTTCTG CTGGCGCGAA GTTAAACAC 
CACGACAGGT CTTTATGATC TGAAAACCGA AAATACCTTG TTAACTACCG ATGCTGCATT 
CGATAAATTA GGGAATGGCG ATAAAGTCAC CGTTGGCGGC GTAGATTATA CTTACAACGC 
TAAATCTGGT GATTTTACTA CCACCAAATC TACTGCTGGT ACGGGTGTAG ACGCCGCGGG 
GCAGGCTACT GATTCAGCTA AAAAACGTGA TGCGTTAGCT GCCACCCTTC ATGCTGATGT 
GGGTAAATCT GTTAATGGTT CTTACACCAC AAAAGATGGT ACTGTTTCTT TCGAAACGGA 
TTCAGCAGGT AATATCACCA TCGGTGGAAG CCAGGCATAC GTAGACGATG CAGGCAACTT 
GACGACTAAC AACGCTGGTA GCGCAGCTAA AGCTGATATG AAAGCGCTGC TTAAAGCCGC
GAGCGAAGGT AGTGACGGTG CCTCTCTGAC ATTCAATGGC ACTGAATATA CTATCGCAAA 
AGCAACTCCT GCGACAACCT CTCCAGTAGC TCCGTTAATC CCTGGTGGGA TTACTTATCA 
GGCTACAGTG AGTAAAGATG TAGTATTGAG CGAAACCAAA GCGGCTGCCG CGACATCTTC 
AATTACCTTT AATTCCGGTG TACTGAGCAA AACTATTGGG TTTACCGCGG GTGAATCCAG 
TGATGCTGCG AAGTCTTATG TGGATGATAA AGGTGGTATT ACTAACGTTG CCGACTATAC 
AGTCTCTTAC AGCGTTAACA AGGATAACGG CTCTGTGACT GTTGCCGGGT ATGCTTCAGC 
GACTGATACC AATAAAGATT ATGCTCCAGC AATTGGTACT GCTGTAAATG TGAACTCCGC 
GGGTAAAATC ACTACTGAGA CTACCAGTGC TGGTTCTGCA ACGACCAACC CGCTTGCTGC 
CCTGGACGAC GCTATCAGCT CCATCGACAA ATTCCGTTCT TCCCTGGGTG CTATCCAGAA 
CCGTCTGGAT TCCGCAGTCA CCAACCTGAA CAACACCACT ACCAACCTGT CTGAAGCGCA 
GTCCCGTATT CAGGACGCCG ACTATGCGAC CGAAGTGTCC AACATGTCGA AAGCGCAGAT 
TATCCAGCAG GCCGGTAACT CCGTGCTGGC AAAAGCCAAC CAGGTACCGC AGCAGGTTCT 
GTCTCTGCTG CAGGGTTAA 
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ATGGCACAAG TCATTAATAC CAACAGCCTC TCGCTGATCA CTCAAAATAA TATCAACAAG 
AACCAGTCTG CGCTGTCGAG TTCTATCGAG CGTCTGTCTT CTGGCTTGCG TATTAACAGC 
GCGAAGGATG ACGCCGCAGG TCAGGCGATT GCTAACCGTT TTACTTCTAA CATTAAAGGC 
CTGACTCAGG CGGCCCGTAA CGCCAACGAC GGTATTTCTG TTGCGCAGAC CACCGAAGGC 
GCGCTGTCCG AAATCAACAA CAACTTACAG CGTATTCGTG AACTGACGGT TCAGGCCACT 
ACAGGGACTA ACTCCGATTC TGACCTGGAC TCCATCCAGG ACGAAATCAA ATCTCGTCTT 
GATGAAATTG ACCGCGTATC CGGCCAGACC CAGTTCAACG GCGTGAACGT GCTGGCGAAA 
GACGGTTCAA TGAAAATTCA GGTTGGTGCG AATGACGGCG AAACCATCAC GATCGACCTG 
AAAAAAATCG ATTCTGATAC TCTGGGTCTG AATGGCTTTA ACGTAAATGG TAAAGGTACT 
ATTACCAACA AAGCTGCAAC GGTAAGTGAT TTAACTTCTG CTGGCGCGAA GTTAAACAC 
CACGACAGGT CTTTATGATC TGAAAACCGA AAATACCTTG TTAACTACCG ATGCTGCATT 
CGATAAATTA GGGAATGGCG ATAAAGTCAC AGTTGGCGGC GTAGATTATA CTTACAACGC 
TAAATCTGGT GATTTTACTA CCACTAAATC TACTGCTGGT ACGGGTGTAG ACGCCGCGGC 
GCAGGCTGCT GATTCAGCTT CAAAACGTGA TGCGTTAGCT GCCACCCTTC ATGCTGATGT 
GGGTAAATCT GTTAATGGTT CTTACACCAC AAAAGATGGT ACTGTTTCTT TCGAAACGGA 
TTCAGCAGGT AATATCACCA TCGGTGGAAG CCAGGCATAC GTAGACGATG CAGGCAACTT 
GACGACTAAC AACGCTGGTA GCGCAGCTAA AGCTGATATG AAAGCGCTGC TCAAAGCAGC 
GAGCGAAGGT AGTGACGGTG CCTCTCTGAC ATTCAATGGC ACAGAATATA CCATCGCAAA 
AGCAACTCCT GCGACAACCA CTCCAGTAGC TCCGTTAATC CCTGGTGGGA TTACTTATCA 
GGCTACAGTG AGTAAAGATG TAGTATTGAG CGAAACCAAA GCGGCTGCCG CGACATCTTC 
AATTACCTTT AATTCCGGTG TACTGAGCAA AACTATTGGG TTTACCGCGG GTGAATCCAG 
TGATGCTGCG AAGTCTTATG TGGATGATAA AGGTGGTATC ACTAACGTTG CCGACTATAC 
AGTCTCTTAC AGCGTTAACA AGGATAACGG CTCTGTGACT GTTGCCGGGT ATGCTTCAGC 
GACTGATACC AATAAAGATT ATGCTCCAGC AATTGGTACT GCTGTAAATG TGAACTCCGC 
GGGTAAAATC ACTACTGAGA CTACCAGTGC TGGTTCTGCA ACGACCAACC CGCTTGCTGC 
CCTGGACGAC GCAATCAGCT CCATCGACAA ATTCCGTTCT TCCCTGGGTG CTATCCAGAA 
CCGTCTGGAT TCCGCAGTCA CCAACCTGAA CAACACCACT ACCAACCTGT CCGAAGCGCA 
GTCCCGTATT CAGGACGCCG ACTATGCGAC CGAAGTGTCC AACATGTCGA AAGCGCAGAT 
CATTCAGCAG GCCGGTAACT CCGTGCTGGC AAAAGCTAAC CAGGTACCGC AGCAGGTTCT 
GTCTCTGCTG CAGGGTTAA 
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ATGGCACAAG TCATTAATAC CAACAGCCTC TCGCTGATCA CTCAAAATAA TATCAACAAG 
AACCAGTCTG CGCTGTCGAG TTCTATCGAG CGTCTGTCTT CTGGCTTGCG TATTAACAGC 
GCGAAGGATG ACGCAGCGGG TCAGGCGATT GCTAACCGTT TTACTTCTAA CATTAAAGGC 
CTGACTCAGG CTGCACGTAA CGCCAACGAC GGTATTTCTG TTGCGCAGAC CACCGAAGGC 
GCGCTGTCCG AAATCAACAA CAACTTACAG CGTATTCGTG AACTGACGGT TCAGGCCACT
ACAGGGACTA ACTCCGATTC TGACCTGGAC TCCATCCAGG ACGAAATCAA ATCTCGTCTT 
GATGAAATTG ACCGCGTATC CGGCCAGACC CAGTTCAACG GCGTGAACGT GCTGGCGAAA 
GACGGTTCAA TGAAAATTCA GGTTGGTGCG AATGACGGCG AAACCATCAC GATCGACCTG 
AAAAAAATCG ATTCTGATAC TCTGGGTCTG AATGGCTTTA ACGTAAATGG TAAAGGTACT 
ATTACCAACA AAGCTGCAAC GGTAAGTGAT TTAACTTCTG CTGGCGCGAA GTTAAACAC 
CACGACAGGT CTTTATGATC TGAAAACCGA AAATACCTTG TTAACTACCG ATGCTGCATT 
CGATAAATTA GGGAATGGCG ATAAAGTCAC AGTTGGCGGC GTAGATTATA CTTACAACGC 
TAAATCTGGT GATTTTACTA CCACTAAATC TACTGCTGGT ACGGGTGTAA ACGCCGCGGC 
GCAGGCTGCT GATTCAGCTT CAAAACGTGA TGCGTTAGCT GCCACCCTTC ATGCTGATGT 
GGGTAAATCT GTTAATGGTT CTTACACCAC AAAAGATGGT ACTGTTTCTT TCGAAACGGA 
TTCAGCAGGT AATATCACCA TCGGTGGAAG CCAGGCATAC GTAGACGATG CAGGCAACTT 
GACGACTAAC AACGCTGGTA GCGCAGCTAA AGCTGATATG AAAGCGCTGC TCAAAGCAGC 
GAGCGAAGGT AGTGACGGTG CCTCTCTGAC ATTCAATGGC ACAGAATATA CCATCGCAAA 
AGCAACTCCT GCGACAACCA CTCCAGTAGC TCCGTTAATC CCTGGTGGGA TTACTTATCA 
GGCTACAGTG AGTAAAGATG TAGTATTGAG CGAAACCAAA GCGGCTGCCG CGACATCTTC 
AATTACCTTT AATTCCGGTG TACTGAGCAA AACTATTGGG TTTACCGCGG GTGAATCCAG 
TGATGCTGCG AAGTCTTATG TGGATGATAA AGGTGGTATC ACTAACGTTG CCGACTATAC 
AGTCTCTTAC AGCGTTAACA AGGATAACGG CTCTGTGACT GTTGCCGGGT ATGCTTCAGC 
GACTGATACC AATAAAGATT ATGCTCCAGC AATTGGCACT GCTGTAAATG TGAACTCCGC 
GGGTAAAATC ACTACTGAGA CTACCAGTGC TGGTTCTGCA ACGACCAACC CGCTTGCTGC 
CCTGGACGAC GCAATCAGCT CCATCGACAA ATTCCGTTCT TCCCTGGGTG CTATCCAGAA 
CCGTCTGGAT TCCGCGGTCA CCAACCTGAA CAACACCACT ACCAACCTGT CCGAAGCGCA 
GTCCCGTATT CAGGACGCCG ACTATGCGAC CGAAGTGTCC AACATGTCGA AAGCGCAGAT 
CATCCAGCAG GCCGGTAACT CCGTGCTGGC AAAAGCTAAC CAGGTACCGC AGCAGGTTCT 
GTCTCTGCTG CAGGGTTAA 



Figure 61

WO 99/61458

PCT/AU99/00385

84/96



ATGGCACAAG TCATTAATAC CAACAGCCTC TCGCTGATCA CTCAAAATAA TATCAACAAG 
AACCAGTCTG CGCTGTCGAG TTCTATCGAG CGTCTGTCTT CTGGCTTGCG TATTAACAGC 
GCGAAGGATG ACGCCGCGGG TCAGGCGATT GCTAACCGTT TTACTTCTAA CATTAAAGGC 
CTGACTCAGG CTGCACGTAA CGCCAACGAC GGTATTTCTG TTGCACAGAC CACTGAAGGC 
GCGCTGTCCG AAATCAACAA CAACTTACAG CGTATCCGTG AGCTGACGGT TCAGGCTTCT 
ACCGGGACTA ACTCTGATTC GGATCTGGAC TCCATTCAGG ACGAAATCAA ATCCCGTCTC 
GACGAAATTG ACCGCGTATC CGGTCAGACC CAGTTCAACG GCGTGAACGT ACTGGCAAAA 
GACGGTTCGA TGAAAATTCA GGTTGGTGCG AATGACGGTG AAACTATCAC TATCGACCTG 
AAGAAAATCG ATTCTGATAC TCTGGGTCTG AATGGTTTTA ACGTAAATGG TAAAGGTACT 
ATTACCAACA AAGCTGCAAC GGTAAGTGAT TTAACTTCTG CTGGCGCGAA GTTAAACAC 
CACGACAGGT CTTTATGATC TGAAAACCGA AAATACCTTG TTAACTACCG ATGCTGCATT 
CGATAAATTA GGGAATGGCG ATAAAGTCAC CGTTGGCGGC GTAGATTATA CTTACAACGC 
TAAATCTGGT GATTTTACTA CCACCAAATC TACTGCTGGT ACGGGTGTAG ACGCCGCGGC 
GCAGGCTACT GATTCAGCTA AAAAACGTGA TGCGTTAGCT GCCACCCTTC ATGCTGATGT 
GGGTAAATCT GTTAATGGTT CTTACACCAC AAAAGATGGT ACTGTTTCTT TCGAAACGGA 
TTCAGCAGGT AATATCACCA TCGGTGGAAG CCAGGCATAC GTAGACGATG CAGGCAACTT 
GACGACTAAC AACGCTGGTA GCGCAGCTAA AGCTGATATG AAAGCGCTGC TTAAAGCCGC 
GAGCGAAGGT AGTGACGGTG CCTCTCTGAC ATTCAATGGC ACTGAATATA CTATCGCAAA 
AGCAACTCCT GCGACAACCT CTCCAGTAGC TCCGTTAATC CCTGGTGGGA TTACTTATCA 
GGCTACAGTG AGTAAAGATG TAGTATTGAG CGAAACCAAA GCGGCTGCCG CGACATCTTC 
AATTACCTTT AATTCCGGTG TACTGAGCAA AACTATTGGG TTTACCGCGG GTGAATCCAG 
TGATGCTGCG AAGTCTTATG TGGATGATAA AGGTGGTATT ACTAACGTTG CCGACTATAC 
AGTCTCTTAC AGCGTTAACA AGGATAACGG CTCTGTGACT GTTGCCGGGT ATGCTTCAGC 
GACTGATACC AATAAAGATT ATGCTCCAGC AATTGGTACT GCTGTAAATG TGAACTCCGC 
GGGTAAAATC ACTACTGAGA CTACCAGTGC TGGTTCTGCA ACGACCAACC CGCTTGCTGC 
CCTGGACGAC GCTATCAGCT CCATCGACAA ATTCCGTTCT TCCCTGGGTG CTATCCAGAA 
CCGTCTGGAT TCCGCAGTCA CCAACCTGAA CAACACCACT ACCAACCTGT CTGAAGCGCA
GTCCCGTATT CAGGACGCCG ACTATGCGAC CGAAGTGTCC AACATGTCGA AAGCGCAGAT 
TATCCAGCAG GCCGGTAACT CCGTGCTGGC AAAAGCCAAC CAGGTACCGC AGCAGGTTCT 
GTCTCTGCTG CAGGGTTAA 
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ATGGCACAAGTCATTAATACCAACAGCCTCTCGCTGATCACTCAAAATAATATCAACAAG 

AACCAGTCTGCGCTGTCGAGTTCTATCGAGCGTCTGTCTTCTGGCTTGCGTATTAACAGC 

GCGAAGGATGACGCCGCAGGTCAGGCGATTGCTAACCGTTTTACTTCTAACATTAAAGGC 

CTGACTCAGGCGGCCCGTAACGCCAACGACGGTATTTCTGTTGCGCAGACCACCGAAGGC 

GCGCTGTCCGAAATCAACAACAACTTACAGCGTATTCGTGAACTGACGGTTCAGGCCACT

ACAGGGACTAACTCCGATTCTGACCTGGACTCCATCCAGGACGAAATCAAATCTCGTCTT 

GATGAAATTGACCGCGTATCCGGCCAGACCCAGTTCAACGGCGTGAACGTGCTGGCGAAA 

GACGGTTCAATGAAAATTCAGGTTGGTGCGAATGACGGCGAAACCATCACGATCGACCTG 

AAAAAAATCGATTCTGATACTCTGGGTCTGAATGGCTTTAACGTAAATGGTAAAGGTACT 

ATTACCAACAAAGCTGCAACGGTAAGTGATTTAACTTCTGCTGGCGCGAAGTTAAACACC 

ACGACAGGTCTTTATGATCTGAAAACCGAAAATACCTTGTTAACTACCGATGCTGCATTC 

GATAAATTAGGGAATGGCGATAAAGTCACAGTTGGCGGCGTAGATTATACTTACAACGCT 

AAATCTGGTGATTTTACTACCACTAAATCTACTGCTGGTACGGGTGTAGACGCCGCGGCG 

CAGGCTGCTGATTCAGCTTCAAAACGTGATGCGTTAGCTGCCACCCTTCATGCTGATGTG 

GGTAAATCTGTTAATGGTTCTTACACCACAAAAGATGGTACTGTTTCTTTCGAAACGGAT 

TCAGCAGGTAATATCACCATCGGTGGAAGCCAGGCATACGTAGACGATGCAGGCAACTTG 

ACGACTAACAACGCTGGTAGCGCAGCTAAAGCTGATATGAAAGCGCTGCTCAAAGCAGCG 

AGCGAAGGTAGTGACGGTGCCTCTCTGACATTCAATGGCACAGAATATACCATCGCAAAA 

GCAACTCCTGCGACAACCACTCCAGTAGCTCCGTTAATCCCTGGTGGGATTACTTATCAG 

GCTACAGTGAGTAAAGATGTAGTATTGAGCGAAACCAAAGCGGCTGCCGCGACATCTTCA 

ATTACCTTTAATTCCGGTGTACTGAGCAAAACTATTGGGTTTACCGCGGGTGAATCCAGT 

GATGCTGCGAAGTCTTATGTGGATGATAAAGGTGGTATCACTAACGTTGCCGACTATACA 

GTCTCTTACAGCGTTAACAAGGATAACGGCTCTGTGACTGTTGCCGGGTATGCTTCAGCG 

ACTGATACCAATAAAGATTATGCTCCAGCAATTGGTACTGCTGTAAATGTGAACTCCGCG

GGTAAAATCACTACTGAGACTACCAGTGCTGGTTCTGCAACGACCAACCCGCTTGCTGCC 

CTGGACGACGCAATCAGCTCCATCGACAAATTCCGTTCTTCCCTGGGTGCTATCCAGAAC 

CGTCTGGATTCCGCAGTCACCAACCTGAACAACACCACTACCAACCTGTCCGAAGCGCAG 

TCCCGTATTCAGGACGCCGACTATGCGACCGAAGTGTCCAACATGTCGAAAGCGCAGATC 

ATTCAGCAGGCCGGTAACTCCGTGCTGGCAAAAGCTAACCAGGTACCGCAGCAGGTTCTG 

TCTCTGCTGCAGGGTTAA 
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ATGGCACAAG TCATTAATAC CAACAGCCTC TCGCTGATCA CTCAAAATAA TATCAACAAG 
AACCAGTCTG CGCTGTCGAG TTCTATCGAG CGTCTGTCTT CTGGCTTGCG TATTAACAGC 
GCGAAGGATG ACGCCGCAGG TCAGGCGATT GCTAACCGTT TTACTTCTAA CATTAAAGGC 
CTGACTCAGG CTGCACGTAA CGCCAACGAC GGTATTTCTG TTGCGCAGAC CACCGAAGGC 
GCGCTGTCCG AAATCAACAA CAACTTACAG CGTATTCGTG AACTGACGGT TCAGGCCACT 
ACAGGGACTA ACTCCGATTC TGACCTGGAC TCCATCCAGG ACGAAATCAA ATCTCGTCTT 
GATGAAATTG ACCGCGTATC CGGCCAGACC CAGTTCAACG GCGTGAACGT GCTGGCGAAA 
GACGGTTCAA TGAAAATTCA GGTTGGTGCG AATGACGGCG AAACCATCAC GATCGACCTG 
AAAAAAATCG ATTCTGATAC TCTGGGTCTG AATGGCTTTA ACGTAAATGG TAAAGGTACT 
ATTACCAACA AAGCTGCAAC GGTAAGTGAT TTAACTTCTG CTGGCGCGAA GTTAAACAC 
CACGACAGGT CTTTATGATC TGAAAACCGA AAATACCTTG TTAACTACCG ATGCTGCATT 
CGATAAATTA GGGAATGGCG ATAAAGTCAC AGTTGGCGGC GTAGATTATA CTTACAACGC 
TAAATCTGGT GATTTTACTA CCACTAAATC TACTGCTGGT ACGGGTGTAG ACGCCGCGGC 
GCAGGCTGCT GATTCAGCTT CAAAACGTGA TGCGTTAGCT GCCACCCTTC ATGCTGATGT 
GGGTAAATCT GTTAATGGTT CTTACACCAC AAAAGATGGT ACTGTTTCTT TCGAAACGGA 
TTCAGCAGGT AATATCACCA TCGGTGGAAG CCAGGCATAC GTAGACGATG CAGGCAACTT 
GACGACTAAC AACGCTGGTA GCGCAGCTAA AGCTGATATG AAAGCGCTGC TCAAAGCAGC 
GAGCGAAGGT AGTGACGGTG CCTCTCTGAC ATTCAATGGC ACAGAATATA CCATCGCAAA 
AGCAACTCCT GCGACAACCA CTCCAGTAGC TCCGTTAATC CCTGGTGGGA TTACTTATCA 
GGCTACAGTG AGTAAAGATG TAGTATTGAG CGAAACCAAA GCGGCTGCCG CGACATCTTC 
AATTACCTTT AATTCCGGTG TACTGAGCAA AACTATTGGG TTTACCGCGG GTGAATCCAG 
TGATGCTGCG AAGTCTTATG TGGATGATAA AGGTGGTATC ACTAACGTTG CCGACTATAC 
AGTCTCTTAC AGCGTTAACA AGGATAACGG CTCTGTGACT GTTGCCGGGT ATGCTTCAGC 
GACTGATACC AATAAAGATT ATGCTCCAGC AATTGGCACT GCTGTAAATG TGAACTCCGC 
GGGTAAAATC ACTACTGAGA CTACCAGTGC TGGTTCTGCA ACGACCAACC CGCTTGCTGC 
CCTGGACGAC GCAATCAGCT CCATCGACAA ATTCCGTTCT TCCCTGGGTG CTATCCAGAA 
CCGTCTGGAT TCCGCGGTCA CCAACCTGAA CAACACCACT ACCAACCTGT CCGAAGCGCA 
GTCCCGTATT CAGGACGCCG ACTATGCGAC CGAAGTGTCC AACATGTCGA AAGCGCAGAT 
CATCCAGCAG GCCGGTAACT CCGTGCTGGC AAAAGCTAAC CAGGTACCGC AGCAGGTTCT 
GTCTCTGCTG CAGGGTTAA 
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ATGGCACAAG TCATTAATAC CAACAGCCTC TCGCTGATCA CTCAAAATAA TATCAACAAG 
AACCAGTCTG CGCTGTCGAG TTCTATCGAG CGTCTGTCTT CTGGCTTGCG TATTAACAGC 
GCGAAGGATG ACGCCGCGGG TCAGGCGATT GCTAACCGTT TTACTTCTAA CATTAAAGGC 
CTGACTCAGG CTGCACGTAA CGCCAACGAC GGTATTTCTG TTGCACAGAC CACTGAAGGC 
GCGCTGTCCG AAATCAACAA CAACTTACAG CGTATCCGTG AGCTGACGGT TCAGGCTTCT 
ACCGGGACTA ACTCTGATTC GGATCTGGAC TCCATTCAGG ACGAAATCAA ATCCCGTCTC 
GACGAAATTG ACCGCGTATC CGGTCAGACC CAGTTCAACG GCGTGAACGT ACTGGCAAAA 
GACGGTTCGA TGAAAATTCA GGTTGGTGCG AATGACGGTG AAACTATCAC TATCGACCTG 
AAGAAAATCG ATTCTGATAC TCTGGGTCTG AATGGTTTTA ACGTAAATGG TAAAGGTACT 
ATTACCAACA AAGCTGCAAC GGTAAGTGAT TTAACTTCTG CTGGCGCGAA GTTAAACACC 
ACGACAGGT CTTTATGATC TGAAAACCGA AAATACCTTG TTAACTACCG ATGCTGCATT 
CGATAAATTA GGGAATGGCG ATAAAGTCAC CGTTGGCGGC GTAGATTATA CTTACAACGC 
TAAATCTGGT GATTTTACTA CCACCAAATC TACTGCTGGT ACGGGTGTAG ACGCCGCGGC 
GCAGGCTACT GATTCAGCTA AAAAACGTGA TGCGTTAGCT GCCACCCTTC ATGCTGATGT 
GGGTAAATCT GTTAATGGTT CTTACACCAC AAAAGATGGT ACTGTTTCTT TCGAAACGGA 
TTCAGCAGGT AATATCACCA TCGGTGGAAG CCAGGCATAC GTAGACGATG CAGGCAACTT 
GACGACTAAC AACGCTGGTA GCGCAGCTAA AGCTGATATG AAAGCGCTGC TTAAAGCCGC 
GAGCGAAGGT AGTGACGGTG CCTCTCTGAC ATTCAATGGC ACTGAATATA CTATCGCAAA 
AGCAACTCCT GCGACAACCT CTCCAGTAGC TCCGTTAATC CCTGGTGGGA TTTCTTATCA 
GGCTACAGTG AGTAAAGATG TAGTATTGAG CGAAACCAAA GCGGCTGCCG CGACATCTTC 
AATTACCTTT AATTCCGGTG TACTGAGCAA AACTATTGGG TTTACCGCGG GTGAATCCAG 
TGATGCTGCG AAGTCTTATG TGGATGATAA AGGTGGTATT ACTAACGTTG CCGACTATAC 
AGTCTCTTAC AGCGTTAACA AGGATAACGG CTCTGTGACT GTTGCCGGGT ATGCTTCAGC 
GACTGATACC AATAAAGATT ATGCTCCAGC AATTGGTACT GCTGTAAATG TGAACTCCGC 
GGGTAAAATC ACTACTGAGA CTACCAGTGC TGGTTCTGCA ACGACCAACC CGCTTGCTGC 
CCTGGACGAC GCTATCAGCT CCATCGACAA ATTCCGTTCT TCCCTGGGTG CTATCCAGAA 
CCGTCTGGAT TCCGCAGTCA CCAACCTGAA CAACACCACT ACCAACCTGT CTGAAGCGCA 
GTCCCGTATT CAGGACGCCG ACTATGCGAC CGAAGTGTCC AACATGTCGA AAGCGCAGAT 
TATCCAGCAG GCCGGTAACT CCGTGCTGGC AAAAGCCAAC CAGGTACCGC AGCAGGTTCT 
GTCTCTGCTG CAGGGTTAA 
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ATGGCACAAG TCATTAATAC CAACAGCCTC TCGCTGATCA CTCAAAATAA TATCAACAAG 
AACCAGTCTG CGCTGTCGAG TTCTATCGAG CGTCTGTCTT CTGGCTTGCG TATTAACAGC 
GCGAAGGATG ACGCCGCAGG TCAGGCGATT GCTAACCGTT TTACTTCTAA CATTAAAGGC 
CTGACTCAGG CGGCCCGTAA CGCCAACGAC GGTATTTCTG TTGCGCAGAC CACCGAAGGC 
GCGCTGTCCG AAATCAACAA CAACTTACAG CGTATTCGTG AACTGACGGT TCAGGCCACT 
ACAGGGACTA ACTCCGATTC TGACCTGGAC TCCATCCAGG ACGAAATCAA ATCTCGTCTT 
GATGAAATTG ACCGCGTATC CGGCCAGACC CAGTTCAACG GCGTGAACGT GCTGGCGAAA 
GACGGTTCAA TGAAAATTCA GGTTGGTGCG AATGACGGCG AAACCATCAC GATCGACCTG 
AAAAAAATCG ATTCTGATAC TCTGGGTCTG AATGGCTTTA ACGTAAATGG TAAAGGTACT 
ATTACCAACA AAGCTGCAAC GGTAAGTGAT TTAACTTCTG CTGGCGCGAA GTTAAACAC 
CACGACAGGT CTTTATGATC TGAAAACCGA AAATACCTTG TTAACTACCG ATGCTGCATT 
CGATAAATTA GGGAATGGCG ATAAAGTCAC AGTTGGCGGC GTAGATTATA CTTACAACGC 
TAAATCTGGT GATTTTACTA CCACTAAATC TACTGCTGGT ACGGGTGTAG ACGCCGCGGC 
GCAGGCTGCT GATTCAGCTT CAAAACGTGA TGCGTTAGCT GCCACCCTTC ATGCTGATGT 
GGGTAAATCT GTTAATGGTT CTTACACCAC AAAAGATGGT ACTGTTTCTT TCGAAACGGA 
TTCAGCAGGT AATATCACCA TCGGTGGAAG CCAGGCATAC GTAGACGATG CAGGCAACTT 
GACGACTAAC AACGCTGGTA GCGCAGCTAA AGCTGATATG AAAGCGCTGC TCAAAGCAGC 
GAGCGAAGGT AGTGACGGTG CCTCTCTGAC ATTCAATGGC ACAGAATATA CCATCGCAAA 
AGCAACTCCT GCGACAACCA CTCCAGTAGC TCCGTTAATC CCTGGTGGGA TTACTTATCA 
GGCTACAGTG AGTAAAGATG TAGTATTGAG CGAAACCAAA GCGGCTGCCG CGACATCTTC 
AATTACCTTT AATTCCGGTG TACTGAGCAA AACTATTGGG TTTACCGCGG GTGAATCCAG 
TGATGCTGCG AAGTCTTATG TGGATGATAA AGGTGGTATC ACTAACGTTG CCGACTATAC 
AGTCTCTTAC AGCGTTAACA AGGATAACGG CTCTGTGACT GTTGCCGGGT ATGCTTCAGC 
GACTGATACC AATAAAGATT ATGCTCCAGC AATTGGTACT GCTGTAAATG TGAACTCCGC 
GGGTAAAATC ACTACTGAGA CTACCAGTGC TGGTTCTGCA ACGACCAACC CGCTTGCTGC
CCTGGACGAC GCAATCAGCT CCATCGACAA ATTCCGTTCT TCCCTGGGTG CTATCCAGAA 
CCGTCTGGAT TCCGCAGTCA CCAACCTGAA CAACACCACT ACCAACCTGT CCGAAGCGCA 
GTCCCGTATT CAGGACGCCG ACTATGCGAC CGAAGTGTCC AACATGTCGA AAGCGCAGAT 
CATTCAGCAG GCCGGTAACT CCGTGCTGGC AAAAGCTAAC CAGGTACCGC AGCAGGTTCT 
GTCTCTGCTG CAGGGTTAA 
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ATGGCACAAG TCATTAATAC CAACAGCCTC TCGCTGATCA CTCAAAATAA TATCAACAAG 
AACCAGTCTG CGCTGTCGAG TTCTATCGAG CGTCTGTCTT CTGGCTTGCG TATTAACAGC 
GCGAAGGATG ACGCCGCGGG TCAGGCGATT GCTAACCGTT TTACTTCTAA CATTAAAGGC 
CTGACTCAGG CTGCACGTAA CGCCAACGAC GGTATTTCTG TTGCACAGAC CACCGAAGGC 
GCGCTGTCTG AAATCAACAA CAACTTACAG CGTATCCGTG AGCTGACGGT TCAGGCTTCT 
ACCGGAACTA ACTCTGATTC GGATCTGGAC TCCATTCAGG ACGAAATCAA ATCCCGTCTT 
GATGAAATTG ACCGCGTATC CGGCCAGACC CAGTTCAACG GCGTGAACGT ACTGGCAAAA 
GACGGTTCGA TGAAAATTCA GGTTGGTGCG AATGACGGTG AAACTATCAC TATCGACCTG 
AAGAAAATCG ATTCTGATAC TCTGGGTCTG AATGGTTTTA ACGTAAATGG TAAAGGTACT
ATTACCAACA AAGCTGCAAC GGTAAGTGAT TTAACTTCTG CTGGCGCGAA GTTAAACAC 
CACGACAGGT CTTTATGATC TGAAAACCGA AAATACCTTG TTAACTACCG ATGCTGCATT 
CGATAAATTA GGGAATGGCG ATAAAGTCAC CGTTGGCGGC GTAGATTATA CTTACAACGC 
TAAATCTGGT GATTTTACTA CCACCAAATC TACTGCTGGT ACGGGTGTAG ACGCCGCGGC 
GCAGGCTACT GATTCAGCTA AAAAACGTGA TGCGTTAGCT GCCACCCTTC ATGCTGATGT 
GGGTAAATCT GTTAATGGTT CTTACACCAC AAAAGATGGT ACTGTTTCTT TCGAAACGGA 
TTCAGCAGGT AATATCACCA TCGGTGGAAG CCAGGCATAC GTAGACGATG CAGGCAACTT 
GACGACTAAC AACGCTGGTA GCGCAGCTAA AGCTGATATG AAAGCGCTGC TTAAAGCCGC 
GAGCGAAGGT AGTGACGGTG CTTCTCTGAC ATTCAATGGC ACTGAATATA CTATCGCAAA 
AGCAACTCCT GCGACAACCT CTCCAGTAGC TCCGTTAATC CCTGGTGGGA TTACTTATCA 
GGCTACAGTG AGTAAAGATG TAGTATTGAG CGAAACCAAA GCGGCTGCCG CGACATCTTC 
AATTACCTTT AATTCCGGTG TACTGAGCAA AACTATTGGG TTTACCGCGG GTGAATCCAG 
TGATGCTGCG AAGTCTTATG TGGATGATAA AGGTGGTATT ACTAACGTTG CCGACTATAC 
AGTCTCTTAC AGCGTTAACA AGGATAACGG CTCTGTGACT GTTGCCGGGT ATGCTTCAGC 
GACTGATACC AATAAAGATT ATGCTCCAGC AATTGGTACT GCTGTAAATG TGAACTCCGC 
GGGTAAAATC ACTACTGAGA CTACCAGTGC TGGTTCTGCA ACGACCAACC CGCTTGCTGC 
CCTGGACGAC GCTATCAGCT CCATCGACAA ATTCCGTTCT TCCCTGGGTG CTATCCAGAA 
CCGTCTGGAT TCCGCAGTCA CCAACCTGAA CAACACCACT ACCAACCTGT CTGAAGCGCA 
GTCCCGTATT CAGGACGCCG ACTATGCGAC CGAAGTGTCC AACATGTCGA AAGCGCAGAT 
TATCCAGCAG GCCGGTAACT CCGTGCTGGC AAAAGCCAAC CAGGTACCGC AGCAGGTTCT 
GTCTCTGCTG CAGGGTTAA 
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ATGGCACAAG TCATTAATAC CAACAGCCTC TCGCTGATCA CTCAAAATAA TATCAACAAG 
AACCAGTCTG CGCTGTCGAG TTCTATCGAG CGTCTGTCTT CTGGCTTGCG TATTAACAGC 
GCGAAGGATG ACGCCGCGGG TCAGGCGATT GCTAACCGTT TTACTTCTAA CATTAAAGGC 
CTGACTCAGG CTGCACGTAA CGCCAACGAC GGTATTTCTG TTGCACAGAC CACTGAAGGC 
GCGCTGTCCG AAATCAACAA CAACTTACAG CGTATCCGTG AGCTGACGGT TCAGGCTTCT 
ACCGGGACTA ACTCTGATTC GGATCTGGAC TCCATTCAGG ACGAAATCAA ATCCCGTCTC 
GACGAAATTG ACCGCGTATC CGGTCAGACC CAGTTCAACG GCGTGAACGT ACTGGCAAAA 
GACGGTTCGA TGAAAATTCA GGTTGGTGCG AATGACGGTG AAACTATCAC TATCGACCTG 
AAGAAAATCG ATTCTGATAC TCTGGGTCTG AATGGTTTTA ACGTAAATGG TAAAGGTACT 
ATTACCAACA AAGCTGCAAC GGTAAGTGAT TTAACTTCTG CTGGCGCGAA GTTAAACAC 
CACGACAGGT CTTTATGATC TGAAAACCGA AAATACCTTG TTAACTACCG ATGCTGCATT 
CGATAAATTA GGGAATGGCG ATAAAGTCAC CGTTGGCGGC GTAGATTATA CTTACAACGC 
TAAATCTGGT GATTTTACTA CCACCAAATC TACTGCTGGT ACGGGTGTAG ACGCCGCGGG 
GCAGGCTACT GATTCAGCTA AAAAACGTGA TGCGTTAGCT GCCACCCTTC ATGCTGATGT 
GGGTAAATCT GTTAATGGTT CTTACACCAC AAAAGATGGT ACTGTTTCTT TCGAAACGGA 
TTCAGCAGGT AATATCACCA TCGGTGGAAG CCAGGCATAC GTAGACGATG CAGGCAACTT 
GACGACTAAC AACGCTGGTA GCGCAGCTAA AGCTGATATG AAAGCGCTGC TTAAAGCCGC 
GAGCGAAGGT AGTGACGGTG CCTCTCTGAC ATTCAATGGC ACTGAATATA CTATCGCAAA 
AGCAACTCCT GCGACAACCT CTCCAGTAGC TCCGTTAATC CCTGGTGGGA TTTCTTATCA 
GGCTACAGTG AGTAAAGATG TAGTATTGAG CGAAACCAAA GCGGCTGCCG CGACATCTTC 
AATTACCTTT AATTCCGGTG TACTGAGCAA AACTATTGGG TTTACCGCGG GTGAATCCAG
TGATGCTGCG AAGTCTTATG TGGATGATAA AGGTGGTATT ACTAACGTTG CCGACTATAC 
AGTCTCTTAC AGCGTTAACA AGGATAACGG CTCTGTGACT GTTGCCGGGT ATGCTTCAGC 
GACTGATACC AATAAAGATT ATGCTCCAGC AATTGGTACT GCTGTAAATG TGAACTCCGC 
GGGTAAAATC ACTACTGAGA CTACCAGTGC TGGTTCTGCA ACGACCAACC CGCTTGCTGC 
CCTGGACGAC GCTATCAGCT CCATCGACAA ATTCCGTTCT TCCCTGGGTG CTATCCAGAA 
CCGTCTGGAT TCCGCAGTCA CCAACCTGAA CAACACCACT ACCAACCTGT CTGAAGCGCA 
GTCCCGTATT CAGGACGCCG ACTATGCGAC CGAAGTGTCC AACATGTCGA AAGCGCAGAT 
TATCCAGCAG GCCGGTAACT CCGTGCTGGC AAAAGCCAAC CAGGTACCGC AGCAGGTTCT 
GTCTCTGCTG CAGGGTTAA 
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ATGCGACGTATAGAACGAATACCGGGGTTATCGGCGTAAGCGGGGCAAA 

GTTTACGATTTATTTTTTGGCTTAATGACACGAACAGCAACGAGGAAGGG 

GAGTATTTCGACCGCTAGAAAAAAATTCTAAAGGTTGTGAGTGACCAGAC 

GATAACAGGGTTGACGGCGACGAAGCCGAAGGGTGGAAGCCCAATACTT 

AAACCGTAGACTTGAAAACAGGAAAATGAATCATGGCACAAGTCATTAAT 

ACCAACAGCCTCTCGCTGATCACTCAAAATAATATCAACAAGAACCAGTC 

TGCGCTGTCGACTTCTATCGAGCGCCTCTCTTCTGGTCTGCGCATTAACAG 

CGCTAAAGATGACGCTGCGGGCCAAGCGATTGCTAACCGCTTCACTTCTA 

ACATCAAAGGTCTGACTCAGGCCGCACGTAACGCCAACGACGGTATTTCT 

CTGGCGCAGACCACTGAAGGCGCACTGTCTGAAATCAACAACAACTTGCA 

GCGTGTTCGTGAACTGACCGTTCAGGCCACTACCGGTACTAACTCTGATTC 

TGACCTGTCTTCAATACAGGACGAAATCAAATCCCGTCTCGATGAAATTG 

ACCGCGTATCCGGTCAGACTCAGTTCAACGGCGTTAATGTTCTTTCCAAAG 

ATGGTTCAATGAAAATTCAGGTTGGTGCGAATGATGGTCAAACTATCTCC 

ATCGATCTGAAGAAAATTGATTCTTCAACTTTGGGGCTGAATGGCTTCTCA 

GTTTCTAAAAACTCTCTTAATGTCAGCAATGCTATCACATCTATCCCGCAA 

GCCGCTAGCAATGAACCTGTTGATGTTAACTTCGGTGATACTGATGAGTCT 

GCAGCAATCGCAGCCAAATTGGGGGTTTCCGATACGTCAAGCCTGTCGCT 

GCACAACATCCTTGATAAAGATGGTAAGGCAACAGCTGATTATGTTGTTC 

AGTCAGGTAAAGACTTCTATGCTGCTTCTGTTAATGCCGCTTCAGGTAAAG 

TAACCTTAAACACCATTGATGTTACTTATGATGATTATGCGAACGGTGTTG 

ACGATGCCAAGCAAACAGGTCAGCTGATCAAAGTTTCAGCAGATAAAGAC 

GGCGCAGCTCAAGGTTTTGTCACACTTCAAGGCAAAAACTATTCTGCTGGT 

GATGCGGCAGACATTCTTAAGAATGGAGCAACAGCTCTTAAGTTAACTGA 

TCTGAATTTAAGTGATGTTACTGATACTAATGGTAAGGTAACCACAACTGC 

GACTGAGCAATTTGAAGGTGCTTCAACTGAGGATCCGCTGGCGCTTCTGG 

ATAAAGCTATTGCATCAGTCGACAAATTCCGGTCTTCTCTAGGTGCCGTGC 

AGAACCGTCTCGATTCCGCTATCACCAACCTGAACAACACCACCACCAAC 

CTGTCTGAAGCGCAGTCCCGTATTCAGGACGCCGACTATGCGACCGAAGT 

GTCCAACATGTCGAAAGCGCAGATCATCCAGCAGGCAGGTAACTCCGTGC 

TGTCTAAAGCGAACCAGGTACCGCAGCAAGTTCTGTCACTGTTACAAGGC 

TAATGGCCTTAACCTGCCTGACCCCGCCACCGGCGGGGTTTTTTCTGTCCG 

CAATTTACCGATAACCCCCAAATAACCCCTCATTTCACCCACTAATCGTCC 

GATTAAAAACCCTGCAGAAACGGATAATCATGCCGATAACTCATATAACG 

CAGGGCTGTTTATCGTGAATTCACTCTATACCGCTGAAGGTGTAATGGATA 

AACACTCGCTG 
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AACAGCCTCTCGCTGATCACTCAGAACAACATCAACAAAAACCAGTCTTC 

AATGTCTACTGCCATTGAGCGTCTGTCTTCCGGTCTGCGTATCAACAGCGC 

AAAAGATGACGCTGCTGGCCAGGCGATTGCCAACCGCTTCACCTCTAACA 

TCAAAGGTCTGACTCAGGCAGCTCGTAACGCCAACGACGGTATCTCCGTT 

GCACAGACCACTGAAGGCGCACTGTCTGAAATCAACAACAACCTGCAGCG 

TATCCGTGAGCTGACTGTTCAGTCTTCTACGGGTACTAACTCTGAATCCGA 

TCTGAACTCAATCCAGGACGAAATTAAATCCCGTCTGGACGAAATTGACC 

GCGTATCCGGTCAGACCCAGTTCAACGGCGTGAACGTGCTGGCAAAAGAC 

GGCTCCATGAAAATTCAGGTTGGCGCGAACGATGGTGAAACCATCACCAT 

CGACCTGAAAAAAATTGACTCTTCTACTTTAAACCTGACTGGGTTTAA 



Figure 70A 
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93/96 

CTCAGTATGCTGTCACCGGCAGTACAGGTGCCGTAACTTACGATCCAGAT 

ACAGATCCTGCCGCGACTGGTGATATTGTTTCTGCTTATGTTGATGATGCA 

GGTACATTGACAACTGATGCAAACAAAACTGTAAAATATTATGCCCACAC 

TAATGGTAGCGTCACGAACGACAGTGGTTCAGCTATTTACGCAACTGAAG 

CGGGCAAATTGACTACTGAAGCGTCTACAGCTGCTGAAACTACCGCTAAC 

CCACTGAAAGCCCTGGACGATGCAATCAGCCAGATCGACAAATTCCGTTC 

TTCTCTGGGTGCTGTACAGAACCGTCTGGATTCTGCGGTAACCAACCTGAA 

CAACACCACCACCAACCTGTCTGAAGCGCAGTCCCGTATTCAGGACGCCG 

ACTATGCGACCGAAGTGTCAAATATGTCTAAAGCGCAGATCATCCAGCAG 

GCCGGTAACTCCGTGTTGGCTAAAGCTAACCAGGTTCCTCAGCAGGTT 



Figure 70B 
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94/96 

AGCCTGTCGCTGTTGACCCAGAATAACCTGAACAAATCTCAGTCTTCTCTG 

AGCTCCGCCATTGAGCGTCTCTCTTCTGGCCTGCGTATTAACAGTGCTAAA 

GATGACGCAGCAGGTCAGGCGATTGCTAACCGTTTTACAGCAAATATTAA 

AGGTCTGACTCAGGCTTCCCGTAACGCGAATGATGGTATTTCTGTTGCGCA 

GACCACTGAAGGCGCGCTGAATGAAATTAACAACAACCTGCAGCGTGTAC 

GTGAACTGACTGTTCAGGCAACTAACGGTACTAACTCTGACAGCGATCTTT 

CTTCTATCCAGGCTGAAATTACTCAACGTCTGGAAGAAATTGACCGTGTAT 

CTGAGCAAACTCAGTTTAACGGCGTGAAAGTCCTTGCTGAAAAT 



Figure 71 
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95/96 



GCACGTTAGTTGTTAACGGTGCAACTTACGATGTTAGTGCAGATGGTAAA 

ACGATAACGGAGACTGCTTCTGGTAACAATAAAGTCATGTATCTGAGCAA 

ATCAGAAGGTGGTAGCCCGATTCTGGTAAACGAAGATGCAGCAAAATCGT 

TGCAATCTACCACCAACCCGCTCGAAACTATCGACAAAGCATTGGCTAAA

GTTGACAATCTGCGTTCTGACCTCGGTGCAGTACAAAACCGTTTCGACTCT 

GCTATCACCAACCTTGGCAACACCGTAAACAACCTGTCTTCTGCCCGTAGC 

CGTATCGAAGATGCTGACTACGCGACCGAAGTGTCTAACATGTCTCGTGC 

GCAGATCCTGCAACAAGCGGGTACCTCTGTTCTGGCGCAGGCTAACCAGA 

CCACGCAGAACGTAC 



Figure 72 
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- 1 - 

SEQUENCE LISTING PART 

<110> THE UNIVERSITY OF SYDNEY 

<120> ANTIGENS AND THEIR DETECTION

<130> REEVES

<140> 
<141> 

<160> 68 

<170> PatentIn Ver. 2.0

<210> 1 
<211> 1773 
<212> DNA 

<213> Escherichia coli 
<400> 1 

atgcgacgta tagaacgaat accggggtta tcggcgtaag cggggcaaag tttacgattt 60 
attttttggc ttaatgacac gaacagcaac gaggaagggg agtatttcga ccgctagaaa 120 
aaaattctaa aggttgtgag tgaccagacg ataacagggt tgacggcgac gaagccgaag 180 
ggtggaagcc caatacttaa accgtagact tgaaaacagg aaaatgaatc atggcacaag 240 
tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag aaccagtctg 300 
cgctgtcgac ttctatcgag cgcctctctt ctggtctgcg cattaacagc gctaaagatg 360
acgctgcggg ccaagcgatt gctaaccgct tcacttctaa catcaaaggt ctgactcagg 420 
ccgcacgtaa cgccaacgac ggtatttctc tggcgcagac cactgaaggc gcactgtctg 480 
aaatcaacaa caacttgcag cgtgttcgtg aactgaccgt tcaggccact accggtacta 540 
actctgattc tgacctgtct tcaatacagg acgaaatcaa atcccgtctc gatgaaattg 600 
accgcgtatc cggtcagact cagttcaacg gcgttaatgt tctttccaaa gatggttcaa 660 
tgaaaattca ggttggtgcg aatgatggtc aaactatctc catcgatctg aagaaaattg 720 
attcttcaac tttggggctg aatggcttct cagtttctaa aaactctctt aatgtcagca 780 
atgctatcac atctatcccg caagccgcta gcaatgaacc tgttgatgtt aacttcggtg 840 
atactgatga gtctgcagca atcgcagcca aattgggggt ttccgatacg tcaagcctgt 900 
cgctgcacaa catccttgat aaagatggta aggcaacagc tgattatgtt gttcagtcag 960 
gtaaagactt ctatgctgct tctgttaatg ccgcttcagg taaagtaacc ttaaacacca 1020 
ttgatgttac ttatgatgat tatgcgaacg gtgttgacga tgccaagcaa acaggtcagc 1080 
tgatcaaagt ttcagcagat aaagacggcg cagctcaagg ttttgtcaca cttcaaggca 1140 
aaaactattc tgctggtgat gcggcagaca ttcttaagaa tggagcaaca gctcttaagt 1200 
taactgatct gaatttaagt gatgttactg atactaatgg taaggtaacc acaactgcga 1260 
ctgagcaatt tgaaggtgct tcaactgagg atccgctggc gcttctggat aaagctattg 1320 
catcagtcga caaattccgg tcttctctag gtgccgtgca gaaccgtctc gattccgcta 1380
tcaccaacct gaacaacacc accaccaacc tgtctgaagc gcagtcccgt attcaggacg 1440 
ccgactatgc gaccgaagtg tccaacatgt cgaaagcgca gatcatccag caggcaggta 1500
actccgtgct gtctaaagcg aaccaggtac cgcagcaagt tctgtcactg ttacaaggct 1560 
aatggcctta acctgcctga ccccgccacc ggcggggttt tttctgtccg caatttaccg 1620 
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ataaccccca aataacccct catttcaccc actaatcgtc cgattaaaaa ccctgcagaa 1680 
acggataatc atgccgataa ctcatataac gcagggctgt ttatcgtgaa ttcactctat 1740 
accgctgaag gtgtaatgga taaacactcg ctg 1773 

<210> 2 
<211> 500 
<212> DNA 

<213> Escherichia coli 



<400> 2 

aacagcctct cgctgatcac tcagaacaac atcaacaaaa accagtcttc aatgtctact 60 

gccattgagc gtctgtcttc cggtctgcgt atcaacagcg caaaagatga cgctgctggc 120 

caggcgattg ccaaccgctt cacctctaac atcaaaggtc tgactcaggc agctcgtaac 180 

gccaacgacg gtatctccgt tgcacagacc actgaaggcg cactgtctga aatcaacaac 240 

aacctgcagc gtatccgtga gctgactgtt cagtcttcta cgggtactaa ctctgaatcc 300

gatctgaact caatccagga cgaaattaaa tcccgtctgg acgaaattga ccgcgtatcc 360 

ggtcagaccc agttcaacgg cgtgaacgtg ctggcaaaag acggctccat gaaaattcag 420 

gttggcgcga acgatggtga aaccatcacc atcgacctga aaaaaattga ctcttctact 480 
ttaaacctga ctgggtttaa 500

<210> 3 
<211> 500 
<212> DNA 

<213> Escherichia coli 



<400> 3 

ctcagtatgc tgtcaccggc agtacaggtg ccgtaactta cgatccagat acagatcctg 60 

ccgcgactgg tgatattgtt tctgcttatg ttgatgatgc aggtacattg acaactgatg 120

caaacaaaac tgtaaaatat tatgcccaca ctaatggtag cgtcacgaac gacagtggtt 180 

cagctattta cgcaactgaa gcgggcaaat tgactactga agcgtctaca gctgctgaaa 240 

ctaccgctaa cccactgaaa gccctggacg atgcaatcag ccagatcgac aaattccgtt 300

cttctctggg tgctgtacag aaccgtctgg attctgcggt aaccaacctg aacaacacca 360 

ccaccaacct gtctgaagcg cagtcccgta ttcaggacgc cgactatgcg accgaagtgt 420 

caaatatgtc taaagcgcag atcatccagc aggccggtaa ctccgtgttg gctaaagcta 480 
accaggttcc tcagcaggtt 500 

<210> 4 
<211> 399 
<212> DNA 

<213> Escherichia coli 



<400> 4 

agcctgtcgc tgttgaccca gaataacctg aacaaatctc agtcttctct gagctccgcc 60
attgagcgtc tctcttctgg cctgcgtatt aacagtgcta aagatgacgc agcaggtcag 120
gcgattgcta accgttttac agcaaatatt aaaggtctga ctcaggcttc ccgtaacgcg 180
aatgatggta tttctgttgc gcagaccact gaaggcgcgc tgaatgaaat taacaacaac 240
ctgcagcgtg tacgtgaact gactgttcag gcaactaacg gtactaactc tgacagcgat 300
ctttcttcta tccaggctga aattactcaa cgtctggaag aaattgaccg tgtatctgag 360
caaactcagt ttaacggcgt gaaagtcctt gctgaaaat 399

<210> 5 
<211> 417 
<212> DNA 

<213> Escherichia coli 



<400> 5 

gcacgttagt tgttaacggt gcaacttacg atgttagtgc agatggtaaa acgataacgg 60 
agactgcttc tggtaacaat aaagtcatgt atctgagcaa atcagaaggt ggtagcccga 120 
ttctggtaaa cgaagatgca gcaaaatcgt tgcaatctac caccaacccg ctcgaaacta 180 
tcgacaaagc attggctaaa gttgacaatc tgcgttctga cctcggtgca gtacaaaacc 240 
gtttcgactc tgctatcacc aaccttggca acaccgtaaa caacctgtct tctgcccgta 300 
gccgtatcga agatgctgac tacgcgaccg aagtgtctaa catgtctcgt gcgcagatcc 360
tgcaacaagc gggtacctct gttctggcgc aggctaacca gaccacgcag aacgtac 417 

<210> 6 
<211> 950 
<212> DNA 

<213> Escherichia coli 



<400> 6 

aacaaaaacc agtctgcgct gtcgacttct atcgagcgcc tctcttctgg tctgcgtatt 60 
aacagcgcta aagatgacgc cgcgggccag gcgattgcta accgctttac ttctaacatc 120 
aaaggtctga ctcaggccgc acgtaacgcc aacgacggta tttctctggc gcagacggct 180 
gaaggcgcgc tgtcagagat taacaacaac ttgcagcgta ttcgtgaact gaccgttcag 240 
gcctctaccg gcacgaactc tgattccgac ctgtcttcta ttcaggacga aatcaaatcc 300 
cgtcttgatg aaattgaccg tgtatctggt cagacccagt tcaacggtgt gaacgtgctg 360 
tcgaaaaacg attcgatgaa gattcagatt ggtgccaatg ataaccagac gatcagcatt 420 
ggcttgcaac aaatcgacag taccactttg aatctgaaag gatttaccgt gtccggcatg 480 
gcggatttca gcgcggcgaa actgacggct gctgatggta cagcaattgc tgctgcggat 540 
gtcaaggatg ctgggggtaa acaagtcaat ttactgtctt acactgacac cgcgtctaac 600 
agtactaaat atgcggtcgt tgattctgca accggtaaat acatggaagc cactgtagtc 660 
attaccggta cggcggcggc ggtaactgtt ggtgcagcgg aagtggcggg agccgctaca 720 
gccgatccgt taaaagcact ggatgccgca atcgctaaag tcgacaaatt ccgctcctcc 780 
ctcggtgccg ttcaaaaccg tctggattct gcggtcacca acctgaacaa caccaccacc 840 
aacctgtctg aagcgcagtc ccgtattcag gacgccgact atgcgaccga agtgtccaac 900 
atgtcgaaag cgcagattat ccagcaggcg ggcaactccg tgctgtctaa 950 

<210> 7 
<211> 1212 
<212> DNA 

<213> Escherichia coli 



<400> 7 

aacaaaaacc agtctgcgct gtcgacttct atcgagcgcc tctcttctgg tctgcgtatt 60

aacagcgcta aagatgacgc cgcgggccag gcgattgcta accgcttcac ttctaacatc 120

aaaggtctga ctcaggccgc acgtaacgcc aacgacggta tctctctggc gcagaccact 180

gaaggcgcgc tgtctgaaat caacaacaac ttgcagcgtg tgcgtgagtt gaccgttcag 240

gcgacgaccg ggactaactc tgattctgac ctgtcttcta ttcaggacga aatcaaatcc 300 

cgtctggatg aaattgatcg cgtttccggt cagacccagt tcaacggcgt gaatgtgctg 360 

gcgaaagatg gttcgatgaa gattcaggtt ggcgcgaatg atgggcagac tattagcatt 420 

gatttgcaga agattgactc ttctacatta ggactgaacg gtttctccgt ttcgggtcag 480

tcacttaacg ttagtgattc cattactcaa attaccggtg ccgccgggac aaaacctgtt 540

ggtgttgatt tcactgctgt tgcgaaagat ctgactactg cgacaggtaa aacagtcgat 600 

gtttctagcc tgacgttaca caacactctg gatgcgaaag gggctgctac atcacagttc 660

gtcgttcaat ccggcaatga tttctactcc gcgtcgatta atcatacaga cggcaaagtc 720 

acgttgaata aagccgatgt cgaatacaca gacaccgata atggactaac gactgcggct 780 

actcagaaag atcaactgat taaagttgcc gctgactctg acggctcggc tgcgggatat 840 

gtaacattcc aaggtaaaaa ctacgctaca acggtttcaa cggcacttga tgataatact 900 

gcggcaaaag caacagataa taaagttgtt gttgaattat caacagcaaa accgactgca 960 

cagttctcag gggcttcttc tgctgatcca ctggcacttt tagacaaagc tattgcacag 1020 

gttgatactt tccgctcctc cctcggtgcg gtgcaaaacc gtctggattc cgcagtaacc 1080 

aacctgaaca acaccaccac caacctgtct gaagcgcagt cccgtattca ggacgccgac 1140 

tatgctacag aagtgtccaa catgtcgaaa gcgcagatca tccagcaggc aggtaactcg 1200 
gtgctgtcca aa 1212 

<210> 8 
<211> 1647 
<212> DNA 

<213> Escherichia coli 
<400> 8 

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60 

aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120 

gcgaaggatg acgccgcggg tcaggcgatt gctaaccgtt ttacttctaa cattaaaggc 180 

ctgactcagg ctgcacgtaa cgccaacgac ggtatttctg ttgcgcagac caccgaaggc 240 

gcgctgtccg aaattaacaa caacttacag cgtattcgtg aactgacggt tcaggcttct 300 

accgggacta actctgattc ggatctggac tccattcagg acgaaatcaa atcccgtctc 360 

gacgaaattg accgcgtatc cggtcagacc cagttcaacg gcgtgaacgt actggcaaaa 420 

gacggttcga tgaaaattca ggttggtgcg aatgacggcc agactatcac tattgatctg 480 

aagaaaattg actctgatac gctggggctg aatgggttta atgtgaacgg caaaggggaa 540 

acggctaata cggcagcaac cctgaaagat atgtctggat tcacagctgc ggcggcacca 600 

gggggaactg ttggtgtaac tcaatatact gacaaatcgg ctgtagcaag tagcgtagat 660 

attctaaatg ctgttgctgg cgcagatgga aataaagtta caactagcgc cgatgttggt 720 

tttggtacac cagccgctgc tgtaacctat acctacaata aagacactaa ttcatattcc 780 

gccgcttctg atgatatttc cagcgctaac ctggctgctt tcctcaatcc tcaggccgga 840 

gatacgacta aagctacagt tacaattggt ggcaaagatc aagatgtaaa catcgataaa 900 

tccggtaatt taactgctgc tgatgatggc gcagtacttt atatggatgc taccggtaac 960 

ttaactaaaa ataatgctgg tggtgataca caagctactt tggctaaact tgctactgct 1020 

actggtgcta aagccgcgac catccaaact gataaaggaa cattcaccag tgacggtaca 1080 

gcgtttgatg gtgcatcaat gtccattgat accaatacat ttgcaaatgc agtaaaaaat 1140

gacacttata ctgccactgt aggtgctaag acttatagcg taacaacagg ttctgctgct 1200 

gcagacaccg cttatatgag caatggggtt ctcagtgata ctccgccaac ttactatgca 1260 

caagctgatg gaagtatcac aactactgag gatgcggctg ccggtaaact ggtctacaaa 1320 

ggttccgatg gtaagttaac aacggatacg actagcaaag cagaatcaac atcagatccg 1380






ctggcagctc ttgacgacgc tatcagccag atcgacaaat tccgctcctc cctgggtgcg 1440 
gtgcaaaacc gtctggattc cgcagtgacc aacctgaaca acaccactac caacctgtct 1500 
gaagcgcagt cccgtattca ggacgccgac tatgcgaccg aagtgtccaa catgtcgaaa 1560 
gcgcagatta tccagcaggc cggtaactcc gtgctggcaa aagctaacca ggttccgcag 1620
caggttctgt ctctgctgca gggttaa 1647

<210> 9
<211> 1758
<212> DNA

<213> Escherichia coli
<400> 9

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120
gcgaaggatg acgccgcggg tcaggcgatt gctaaccgtt ttacttctaa cattaaaggc 180
ctgactcagg ctgcacgtaa cgccaacgac ggtatttctg ttgcacagac caccgaaggc 240
gcgctgtctg aaatcaacaa caacttacag cgtatccgtg agctgacggt tcaggcttct 300
accggaacta actctgattc ggatctggac tccattcagg acgaaatcaa atcccgtctt 360
gatgaaattg accgcgtatc cggccagacc cagttcaacg gcgtgaacgt actggcaaaa 420
gacggttcga tgaaaattca ggttggtgcg aatgacggtg aaactatcac tatcgacctg 480
aagaaaatcg attctgatac tctgggtctg aatggtttta acgtaaatgg taaaggtact 540
attaccaaca aagctgcaac ggtaagtgat ttaacttctg ctggcgcgaa gttaaacacc 600
acgacaggtc tttatgatct gaaaaccgaa aataccttgt taactaccga tgctgcattc 660
gataaattag ggaatggcga taaagtcacc gttggcggcg tagattatac ttacaacgct 720
aaatctggtg attttactac caccaaatct actgctggta cgggtgtaga cgccgcggcg 780
caggctactg attcagctaa aaaacgtgat gcgttagctg ccacccttca tgctgatgtg 840
ggtaaatctg ttaatggttc ttacaccaca aaagatggta ctgtttcttt cgaaacggat 900
tcagcaggta atatcaccat cggtggaagc caggcatacg tagacgatgc aggcaacttg 960
acgactaaca acgctggtag cgcagctaaa gctgatatga aagcgctgct taaagccgcg 1020
agcgaaggta gtgacggtgc ttctctgaca ttcaatggca ctgaatatac tatcgcaaaa 1080
gcaactcctg cgacaacctc tccagtagct ccgttaatcc ctggtgggat tacttatcag 1140
gctacagtga gtaaagatgt agtattgagc gaaaccaaag cggctgccgc gacatcttca 1200
attaccttta attccggtgt actgagcaaa actattgggt ttaccgcggg tgaatccagt 1260
gatgctgcga agtcttatgt ggatgataaa ggtggtatta ctaacgttgc cgactataca 1320
gtctcttaca gcgttaacaa ggataacggc tctgtgactg ttgccgggta tgcttcagcg 1380
actgatacca ataaagatta tgctccagca attggtactg ctgtaaatgt gaactccgcg 1440
ggtaaaatca ctactgagac taccagtgct ggttctgcaa cgaccaaccc gcttgctgcc 1500
ctggacgacg ctatcagctc catcgacaaa ttccgttctt ccctgggtgc tatccagaac 1560
cgtctggatt ccgcagtcac caacctgaac aacaccacta ccaacctgtc tgaagcgcag 1620
tcccgtattc aggacgccga ctatgcgacc gaagtgtcca acatgtcgaa agcgcagatt 1680
atccagcagg ccggtaactc cgtgctggca aaagccaacc aggtaccgca gcaggttctg 1740
tctctgctgc agggttaa 1758

<210> 10
<211> 1383
<212> DNA

<213> Escherichia coli

<400> 10

aacaaatctc agtcttctct tagctctgct attgagcgtc tgtcttctgg tctgcgtatt 60 
aacagcgcaa aagacgatgc agcaggtcag gcgattgcta accgttttac ggcaaatatt 120 
aaaggtctga cccaggcttc ccgtaacgca aatgatggta tttctgttgc gcagaccact 180 
gaaggtgcgc tgaatgaaat taacaacaac ctgcagcgta ttcgtgaact ttctgttcag 240 
gcaactaacg gtactaactc tgacagcgat ctttcttcta tccaggctga aattactcaa 300 
cgtctggaag aaattgaccg tgtatctgag caaactcagt ttaacggcgt gaaagtcctt 360 
gctgaaaata atgaaatgaa aattcaggtt ggtgctaatg atggtgaaac catcactatc 420 
aatctggcaa aaattgatgc gaaaactctc ggcctggacg gttttaatat cgatggcgcg 480 
cagaaagcaa caggcagtga cctgatttct aaatttaaag cgacaggtac tgataattat 540 
gatgttggcg gtaaaactta taccgtgaat gtggagagcg gcgcggttaa gaatgatgct 600 
aataaagatg tttttgtaag cgcagctgat ggatcgctga cgaccagtag tgatactaaa 660 
gtatccggtg aaagtattga tgcaacagaa ctagcgaaac ttgcaataaa attagctgac 720
aaaggctcca ttgaatacaa gggcattaca tttactaaca acactggcgc agagcttgat 780 
gctaatggta aaggtgtttt gaccgcaaat attgatggtc aagatgttca atttactatt 840 
gacagtaatg cacccacggg tgccggcgca acaataacta cagacacagc tgtttacaaa 900 
aacagtgcgg gccagttcac cactacaaaa gtggaaaata aagccgcaac actctctgat 960 
ctggatctta atgcagccaa gaaaacaggt agcactttag ttgtaaatgg cgccacctac 1020 
aatgtcagcg cagatggtaa aacggtaact gatactactc ctggtgcccc taaagtgatg 1080 
tatctgagca aatcagaagg tggtagcccg attctggtaa acgaagatgc agcaaaatcg 1140 
ttgcaatcta ccaccaaccc gctcgaaact atcgacaagg cattggctaa agttgacaat 1200 
ctgcgttctg acctcggtgc agtacaaaac cgtttcgact ctgccatcac caaccttggc 1260 
aacaccgtaa acaacctgtc ttctgcccgt agccgtatcg aagatgctga ctacgcgacc 1320 
gaagtgtcta acatgtctcg tgcgcagatc ctgcaacaag cgggtacctc tgttctggcg 1380
cag 1383 



<210> 11 
<211> 2013 
<212> DNA 

<213> Escherichia coli 



<400> 11 

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120
gcgaaggatg acgccgcggg tcaggcgatt gctaaccgtt ttacttctaa cattaaaggc 180
ctgactcagg ctgcacgtaa cgccaacgac ggtatttccg ttgcacagac cactgaaggc 240
gcgctgtccg aaattaacaa caacttacag cgtattcgtg aactgacggt tcaggcttct 300
accgggacta actccgattc ggatctggac tccattcagg acgaaatcaa atcccgtctg 360
gacgaaattg accgcgtatc cggccagacc cagttcaacg gcgtgaacgt gctgtccaaa 420
gatggctcga tgaaaattca ggtcggcgcg aacgatggcg aaacgattac tattgatctg 480
aagaaaattg actctgatac gctgaatctg gctggtttta acgttaacgg taaaggttct 540
gtagcgaata cagctgcgac aagcgacgat ttaaaact§g ctggtttcac taagggcacc 600
acagatacca atggcgtgac cgcgtataca aacacaatta gtaatgacaa agccaaagct 660
tccgatctgt tagctaatat caccgatgga tcagtgatca ctgggggagg ggcaaacgct 720
tttggcgtgg ctgcaaagaa tggttacacc tatgatgcag caagtaaatc ttatagtttt 780
gctgcagatg gtgccgattc agcgaagacg ttaagcatca ttaatccaaa caccggtgat 840
tcgtcgcagg cgacagtgac tattggtggt aaagagcaga aagttaatat ttcccaggat 900
ggaaaaatta ctgcggcaga tgataatgcg acgctgtatt tagataaaca gggaaacttg 960




acaaaaacga atgcaggtaa cgataccgca gcgacttggg atggtttaat ttccaacagc 1020
gattctaccg gtgcggttcc agttggggtt gcaactacaa ttacaattac ttctggtaca 1080
gcttccggaa tgtctgttca gtccgcagga gcaggaattc agacctcaac aaattctcag 1140 
attcttgcag gtggtgcatt tgcggctaag gtaagtattg agggaggcgc tgctacagac 1200 
attttggtag caagtaatgg aaacataaca gcggctgatg gtagtgcact ttatcttgat 1260 
gcgactactg gtggattcac tacaacggct ggaggaaata cagctgcttc gttagataat 1320 
ttaattgcta acagtaagga tgctacctta accgtaactt caggtaccgg ccagaacact 1380
gtttatagca caacaggaag tggcgctcag ttcaccagtt tagcaaaagt agacacagtc 1440 
aatgtcacca acgcacatgt cagtgccgaa ggtatggcaa atctgacaaa aagcaatttt 1500 
accattgata tgggcggtac aggtacagta acttacacag tttccaatgg ggatgtgaaa 1560 
gctgctgcaa atgctgatgt ttatgtcgaa gatggtgcac tttcagccaa tgctacaaaa 1620 
gatgtaacct actttgaaca aaaaaatggg gctattacca acagcaccgg tggtaccatc 1680 
tatgaaacag ctgatggtaa gttaacaaca gaagctacta ctgcatccag ttccaccgcc 1740 
gatcccctga aagctctgga cgaagccatc agctccatcg acaaattccg ctcctccctc 1800
ggtgcggtgc aaaaccgtct ggattccgcg gtcaccaacc tgaacaacac cactaccaac 1860 
ctgtccgaag cgcagtcccg tattcaggac gccgactatg cgaccgaagt gtccaacatg 1920 
tcgaaagcgc agatcatcca gcaggccggt aactccgtgc tggcaaaagc taaccaggta 1980 
ccgcagcagg ttctgtctct gctgcagggt taa 2013 



<210> 12 
<211> 1263 
<212> DNA 

<213> Escherichia coli 



<400> 12 

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120
gcgaaggatg acgccgcggg tcaggcgatt gctaaccgtt ttacttctaa cattaaaggc 180
ctgactcagg ctgcacgtaa cgccaacgac ggtatttctg ttgcgcagac caccgaaggc 240
gcgctgtccg aaattaacaa caacttacag cgtgtgcgtg agctgactgt tcaggcgacc 300
accggtacta actctgagtc tgacctgtct tctatccagg acgaaatcaa atctcgcctg 360
gaagagattg atcgtgtttc aagtcagact caatttaacg gcgtgaatgt tttggctaaa 420
gatgggaaaa tgaacattca ggttggggca aatgatggac agactatcac tattgatctg 480
aaaaagatcg attcatctac actaaacctc tccagttttg atgctacaaa cttgggcacc 540
agtgttaaag atggggccac catcaataag caagtggcag taggtgctgg cgactttaaa 600
gataaagctt caggatcgtt aggtacccta aaattagttg agaaagacgg taagtactat 660
gtaaatgaca ctaaaagtag taagtactac gatgccgaag tagatactag taagggtaaa 720
attaacttca actctacaaa tgaaagtgga actactccta ctgcagcgac ggaagtaact 780
actgttggcc gcgatgtaaa attggatgct tctgcactta aagccaacca atcgcttgtc 840
gtgtataaag ataaaagcgg caatgatgct tatatcattc agaccaaaga tgtaacaact 900
aatcaatcaa ctttcaatgc cgctaatatc agtgatgctg gtgttttatc tattggtgca 960
tctacaaccg tcgccaagcaa tttaacagct aacccgctta aggctcttga tgatgcaatt 1020
gcatctgttg ataaattccg ctcttctctc ggtgccgttc agaaccgtct ggattctgcc 1080
attgccaacc tgaacaacac cactaccaac ctgtctgaag cgcagtcccg tattcaggac 1140
gctgactatg cgaccgaagt gtccaacatg tcgaaagcgc agattatcca gcaggccggt 1200
aactccgtgc tggcaaaagc caaccaggta ccgcagcagg ttctgtctct gctgcagggt 1260
taa 1263




<210> 13 
<211> 1368 
<212> DNA 

<213> Escherichia coli 
<400> 13 

aacaaatctc agtcttctct gagctccgcc attgaacgtc tctcttctgg cctgcgtatt 60 
aacagtgcta aagatgacgc agcaggtcag gcgattgcta accgttttac agcaaatatt 120 
aaaggtctga ctcaggcttc ccgtaacgcg aatgatggta tttctgttgc gcagaccact 180 
gaaggtgcgc tgaatgaaat taacaacaac ctgcagcgtg tacgtgaact gactgttcag 240 
gcaactaacg gtactaactc tgacagcgat ctttcttcta tccaggctga aattactcaa 300 
cgtctggaag aaattgaccg tgtatctgag caaactcagt ttaacggcgt gaaagtcctt 360 
gctgaaaata atgaaatgaa aattcaggtt ggtgctaatg atggtgaaac catcactatc 420 
aatctggcaa aaattgatgc gaaaactctc ggcctggacg gttttaatat cgatggcgcg 480 
cagaaagcaa ctggcagtga cctgatttct aaatttaaag cgacaggtac tgataactat 540 
gatgttggcg gtgatgctta tactgttaac gtagatagcg gagctgttaa agatactaca 600 
gggaatgata tttttgttag tgcagcagat ggttcactga caactaaatc tgacacaaac 660 
atagctggta cagggattga tgctacagca ctcgcagcag cggctaagaa taaagcacag 720 
aatgataaat tcacgtttaa tggagttgaa ttcacaacaa caactgcagc ggatggcaat 780 
gggaatggtg tatattctgc agaaattgat ggtaagtcag tgacatttac tgtgacagat 840 
gctgacaaaa aagcttcttt gattacgagt gagacagttt acaaaaatag cgctggcctt 900 
tatacgacaa ccaaagttga taacaaggct gccacacttt ccgatcttga tctcaatgca 960 
gctaagaaaa caggaagcac gttagttgtt aacggtgcaa cttacgatgt tagtgcagat 1020 
ggtaaaacga taacggagac tgcttctggt aacaataaag tcatgtatct gagcaaatca 1080 
gaaggtggta gcccgattct ggtaaacgaa gatgcagcaa aatcgttgca atctaccacc 1140 
aacccgctcg aaactatcga caaagcattg gctaaagttg acaatctgcg ttctgacctc 1200 
ggtgcagtac aaaaccgttt cgactctgct atcaccaacc ttggcaacac cgtaaacaac 1260 
ctgtcttctg cccgtagccg tatcgaagat gctgactacg cgaccgaagt gtctaacatg 1320 
tctcgtgcgc agatcctgca acaagcgggt acctctgttc tggcgcag 1368 

<210> 14 
<211> 1788 
<212> DNA 

<213> Escherichia coli 
<400> 14 

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60 
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120 
gcgaaggatg acgcagcggg tcaggcgatt gctaaccgtt tcacctctaa cattaaaggc 180 
ctgactcagg cggcccgtaa cgccaacgac ggtatctccg ttgcgcagac caccgaaggc 240 
gcgctgtccg aaatcaacaa caacttacag cgtatccgtg aactgacggt tcaggcttct 300 
accgggacta actccgattc ggatctggac tccattcagg acgaaatcaa atcccgtctg 360 
gacgaaattg accgcgtatc tggccagacc cagttcaacg gcgtgaacgt actggcgaaa 420 
gacggttcaa tgaaaattca ggttggtgcg aatgacggcc agactatcac gattgatctg 480 
aagaaaattg actcagatac gctggggctg aatggtttta acgtgaatgg ttccggtacg 540 
atagccaata aagcggcgac cattagcgac ctgacagcag cgaaaatgga tgctgcaact 600 
aatactataa ctacaacaaa taatgcgctg actgcatcaa aggcgcttga tcaactgaaa 660 
gatggtgaca ctgttactat caaagcagat gctgctcaaa ctgccacggt ttatacatac 720 
aatgcatcag ctggtaactt ctcattcagt aatgtatcga ataatacttc agcaaaagca 780
ggtgatgtag cagctagcct tctcccgccg gctgggcaaa ctgctagtgg tgtttataaa 840 
gcagcaagcg gtgaagtgaa ctttgatgtt gatgcgaatg gtaaaatcac aatcggagga 900 
cagaaagcat atttaactag tgatggtaac ttaactacaa acgatgctgg tggtgcgact 960 
gcggctacgc ttgatggttt attcaagaaa gctggtgatg gtcaatcaat cgggtttaag 1020
aagactgcat cagtcacgat ggggggaaca acttataact ttaaaacggg tgctgatgct 1080 
gatgctgcaa ctgctaacgc aggggtatcg ttcactgata cagctagcaa agaaaccgtt 1140 
ttaaataaag tggctacagc taaacaaggc aaagcagttg cagctgacgg tgatacatcc 1200 
gcaacaatta cctataaatc tggcgttcag acgtatcagg ctgtatttgc cgcaggtgac 1260 
ggtactgcta gcgcaaaata tgccgataaa gctgacgttt ctaatgcaac agcaacatac 1320 
actgatgctg atggtgaaat gactacaatt ggttcataca ccacgaagta ttcaatcgat 1380
gctaacaacg gcaaggtaac tgttgattct ggaactggta cgggtaaata tgcgccgaaa 1440 
gtaggggctg aagtatatgt tagtgctaat ggtactttaa caacagatgc aactagcgaa 1500 
ggcacagtaa caaaagatcc actgaaagct ctggatgaag ctatcagctc catcgacaaa 1560 
ttccgttctt ccctgggtgc tatccagaac cgtctggatt ccgcagtcac caacctgaac 1620 
aacaccacta ccaacctgtc cgaagcgcag tcccgtattc aggacgccga ctatgcgacc 1680 
gaagtgtcca acatgtcgaa agcgcagatc attcagcagg ccggtaactc cgtgctggca 1740 
aaagccaacc aggtaccgca gcaggttctg tctctgctgc agggttaa 1788 

<210> 15 
<211> 1653 
<212> DNA 

<213> Escherichia coli 



<400> 15

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120
gcgaaggatg acgccgcagg tcaggcgatt gctaaccgtt ttacttctaa cattaaaggc 180
ctgactcagg ctgcacgtaa cgccaacgac ggtatttccg ttgcgcagac cactgaaggt 240
gcgctgtccg aaatcaacaa caacttacag cgtattcgtg agctgacggt tcaggcttct 300
accgggacta actccgattc tgacctggac tccatccagg acgaaatcaa gtctcgtctg 360
gacgaaattg accgcgtatc cggtcagacc cagttcaacg gcgtgaacgt gctggcgaaa 420
gacggttcga tgaaaattca ggttggtgcg aatgacggcc agactatcac gattgatctg 480
aagaaaattg actcagatac gctggggctg agtgggttta atgtgaatgg tggcggggct 540
gttgctaaca ctgctgcatc taaagctgac ttggtagctg ctaatgcaac tgtggtaggc 600
aacaaatata ctgtgagtgc gggttacgat gctgctaaag cgtctgattt gctggctgga 660
gttagtgatg gtgatactgt tcaggcaacc attaataacg gcttcggaac ggcggctagt 720
gcaacgaatt acaagtatga cagtgcaagt aagtcttact cttttgatac cacaacggct 780
tcagctgccg atgttcagaa atatttgacc ccgggcgttg gtgataccgc taagggcact 840
attactatcg atggttctgc acaggatgtt cagatcagca gtgatggtaa aattacgtca 900
agcaatggag ataaacttta cattgataca actgggcgct taacgaaaaa cggctttagt 960
gcttctttga ctgaggctag tctgtccaca cttgcagcca ataataccaa agcgacaacc 1020
attgacattg gcggtacctc tatctccttt accggtaata gtactacgcc gaacactatt 1080
acttattcag taacaggtgc aaaagttgat caggcagctt tcgataaagc tgtatcaacc 1140
tctggaaacg atgttgattt cactaccgca ggttatagcg tcgacggcgc aactggcgct 1200
gtaacaaaag gtgttgctcc ggtttatatt gataacaacg gggcgttgac cacatctgat 1260
actgtagatt tttatctaca ggatgatggt tcagtgacta acggcagcgg taaggcagtt 1320
tataaagatg ctgacggtaa attgacgaca gatgctgaaa ctaaagctgc aaccaccgcc 1380
gatcccctga aagctctgga cgaagccatc agctccatcg acaaattccg ctcctccctc 1440
ggtgcggtgc agaaccgtct ggattccgcg gtcaccaacc tgaacaacac cactaccaac 1500 
ctgtctgaag cgcagtcccg tattcaggac gctgactatg cgaccgaagt atccaacatg 1560 
tcgaaagcgc agatcatcca gcaggccggt aactccgtgc tggcaaaagc taaccaggta 1620
ccacagcagg ttctgtctct gctgcagggt taa 1653

<210> 16
<211> 1689
<212> DNA

<213> Escherichia coli
<400> 16

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120
gcgaaggatg acgccgcagg tcaggcgatt gctaaccgtt ttacttctaa cattaaaggc 180
ctgactcagg ctgcacgtaa cgccaacgac ggtatttctg ttgcacagac cactgaaggc 240
gcgctgtccg aaatcaacaa caacttacag cgtgtgcgtg aactgaccgt tcaggcaacc 300
accggtacca actcccagtc tgacctggac tctatccagg acgaaattaa atcccgtctg 360
gacgaaattg atcgcgtatc cggtcagacc cagttcaacg gcgtgaacgt gctggcaaaa 420
gacggttcca tgaaaattca ggttggcgcg aacgatggcc agaccatcac tatcgacctg 480
aagaagattg actcttctac cttgaacctg acaggtttta acgttaacgg ttctggttct 540
gtggcgaata ctgcagcaac taaagctgat ttaaccgctg ctcaactctc tgcaccgggt 600
gcagcagacg caaatggtac agttacttat actgtcagtg ctggttataa agaatccact 660
gctgcagatg ttattgctag catcaaagac ggcagtgctc cgacttctgc aattactgca 720
accattaata atggcttcgg tgattccagt gcgctgactt ccaatgacta tacttatgac 780
ccagcaaaag gcgacttcac ttacgacgta gcttcaagcg ccaataatac tgctgcccag 840
gttcagtcct tcctgacgcc gaaagcaggt gataccgcaa atctgaaagt aaccgttggt 900
acgacatcgg ttgatgtcgt tctggccagt gatggtaaga ttacagcaaa agatggttct 960
gcattatata tcgacagtac aggtaacctg actcagaaca gtgctggctt gacctctgct 1020
aaactggcta ctctgactgg ccttcagggc tctggtgttg cttcaaccat cactactgaa 1080
gatggcacta atattgatat tgctgctaac ggtaatattg gtctgaccgg tgttcgtatc 1140
agtgctgatt ctctgcagtc agcgactaaa tctacgggct ttactgttgg tactggcgct 1200
acaggtctga ccgtaggtac tgatggtaaa gtgactatcg gcgggactac tgctcagtcc 1260
tacaccagca aagatggttc cctgactact gataacacca ctaaactgta tctgcagaaa 1320
gatggctctg taaccaacgg ttcaggtaaa gcggtctatg tagaagcgga tggtgatttc 1380
actaccgacg ctgcaaccaa agccgcaacc accaccgatc cgctgaaagc cctggatgag 1440
gcaatcagcc agatcgataa gttccgttca tccctgggtg ctatccagaa ccgtctggat 1500
tccgcggtca ccaacctgaa caacaccact accaacctgt ctgaagcgca gtcccgtatt 1560
caggacgccg actatgcgac cgaagtgtcc aacatgtcga aagcgcagat cattcagcag 1620
gccggtaact ccgtgctggc aaaagccaac caggtaccgc aacaggttct gtctctgctg 1680
cagggctaa 1689

<210> 17
<211> 915
<212> DNA

<213> Escherichia coli

<400> 17

gcgctgtcga cttctatcga gcgcctctct tctggtctgc gtattaacag cgctaaagat 60
gacgctgcgg gccaggcgat tgctaaccgc ttcacttcta acatcaaagg tctgactcag 120 
gccgcacgta acgccaacga cggtatttct ctggcgcaga cggctgaagg cgcgctgtca 180 
gagattaaca acaacttgca gcgtattcgt gaactgaccg ttcaggcctc taccggcacg 240 
aactctgatt ccgacctgtc ttctattcag gacgaaatca aatcccgtct tgatgaaatt 300
gaccgtgtat ctggtcagac ccagttcaac ggtgtgaacg tgctgtcgaa aaacgattcg 360
atgaagattc agattggtgc caatgataac cagacgatca gcattggctt gcaacaaatc 420 
gacagtacca ctttgaatct gaaaggattt accgtgtccg gcatggcgga tttcagcgcg 480 
gcgaaactga cggctgctga tggtacagca attgctgctg cggatgtcaa ggatgctggg 540 
ggtaaacaag tcaatttact gtcttacact gacaccgcgt ctaacagtac taaatatgcg 600 
gtcgttgatt ctgcaaccgg taaatacatg gcagccactg tagtcattac cagtacggcg 660 
gcggcggtaa ctgttggtgc aacggaagtg gcgggagccg ctacagccga accgttaaaa 720
gcactggatg ccgcaatcgc taaagtcgac aaattccgct cctccctcgg tgccgttcaa 780 
aaccgtctgg attctgcggt caccaacctg aacaacacca ccaccaacct gtctgaagcg 840 
cagtcccgta ttcaggacgc cgactatgcg accgaagtgt ccaacatgtc gaaagcgcag 900 
attatccagc aggcg 915 



<210> 18 
<211> 1665 
<212> DNA 

<213> Escherichia coli 



<400> 18 

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120
gcgaaggatg acgccgcagg tcaggcgatt gctaaccgtt ttacttctaa tattaaaggc 180
ctgactcagg ctgcacgtaa cgccaatgac ggtatttctg ttgcacagac cactgaaggc 240
gcgctgtccg aaatcaacaa caacttacag cgtattcgtg aactgacggt tcaggccact 300
acagggacta actccgattc tgacctggac tccatccagg acgaaatcaa atctcgtctg 360
gacgaaattg accgcgtatc cggtcagacc cagttcaacg gcgtgaacgt gctgtccaaa 420
gatggttcaa tgaaaattca ggtcggcgca aatgatggtg aaaccatcac gattgatctg 480
aagaaaattg actctgatac gctgaatctg gctggtttta acgtgaatgg cgaaggtgaa 540
acagccaata ctgctgcaac acttaaagat atggttggtt taaaactcga taatacgggg 600
gtcactacag ctggagttaa tagatatatt gctgacaaag ccgtcgcaag tagcacggat 660
attttgaatg cggtagctgg tgttgatggc agtaaagttt ccacggaggc agatgttggt 720
tttggtgcag ctgcccctgg tacgccagtg gaatatactt atcataaaga tactaacaca 780
tatacggctt ctgcttcagt tgatgcgact caactggcgg cattcctgaa tcctgaagcg 840
ggtggtacca ctgctgcaac agtaagtatt ggcaacggta caacagctca agagcaaaaa 900
gtcattattg ctaaagatgg ttctttaact gctgctgatg acggtgccgc tctctatctt 960
gatgatactg gtaacttaag taaaactaac gcaggcactg atactcaagc taaactgtct 1020
gacttaatgg caaacaatgc taatgccaaa acagtcatta caacagataa aggtacattt 1080
actgctaata cgacaaagtt tgatggggta gatatttctg ttgatgcttc aacgtttgct 1140
aacgccgtta aaaatgagac ttacactgca actgttggtg taactttacc tgcgacatat 1200
acagtcaata atggcactgc tgcatcagcg tatttagtcg atggaaaagt gagcaaaact 1260
cctgccgagt attttgctca agctgatggc actattacta gtggtgaaaa tgcggctacc 1320
agtaaagcta tctatgtaag tgccaatggt aacttaacga ctaatacaac tagtgaatct 1380
gaagctacta ccaacccgct ggcagcattg gatgacgcta tcgcgtctat cgacaaattc 1440
cgttcttccc tgggtgctat ccagaaccgt ctggattccg cagtcaccaa cctgaacaac 1500
accactacca acctgtctga agcgcagtcc cgtattcagg acgccgacta tgcgaccgaa 1560
gtgtccaaca tgtcgaaagc gcagatcatt cagcaggccg gtaactccgt gctggcaaaa 1620
gccaaccagg taccgcagca ggttctgtct ctgctgcagg gttaa 1665

<210> 19
<211> 1842
<212> DNA

<213> Escherichia coli
<400> 19

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120
gcgaaggatg acgccgcagg tcaggcgatt gctaaccgtt ttacttctaa cattaaaggc 180
ctgactcagg ctgcacgtaa cgccaacgac ggtatttctg ttgcgcagac cactgaaggc 240
gcgctgtccg aaattaacaa caacttacag cgtattcgtg aactgacggt tcaggcgacg 300
accggaacta actccacctc tgacctggac tccatccagg acgaaatcaa atcccgtctt 360
gacgaaattg accgcgtatc tggtcagacc cagttcaacg gcgtgaacgt gctgtctaaa 420
gatggctcga tgaaaattca ggtcggcgcg aacgatggcg aaacgattac tattgatctg 480
aagaaaattg actctgatac gctgaatctg gctggtttta acgttaacgg taaaggttct 540
gtagcgaata ccgctgcgac tacagataat ctgacattgg ctggttttac agcgggtact 600
aaagctgctg atggcaccgt aacttatagc aaaaatgtcc agtttgccgc cgcgactgca 660
agcaatgtac tggctgctgc taaagatggc gacgaaatta cgttcgctgg taataacggc 720
acaggtatag ctgcaactgg ggggacttat acttatcata aggactctaa ctcatacagc 780
tttagcgcaa cggctgcatc taaagattct ctgttgagca cactggcacc aaacgctggc 840
gatacattta ccgctaaagt gactattggt tctaaatcgc aagaagttaa cgttagcaaa 900
gatggtacga ttacatccag cgatggtaag gcgctgtatt tagatgagaa gggcaacctg 960
acccaaacag gtagtggcac aaccaaagct gcaacctggg ataacctgat ggccaataca 1020
gatactacag gcaaagatgc ctatggtaac tctgcggcag cagctgttgg gacagtaatc 1080
gaagcaaaag gaatgaccat cacttctgct ggtggtaatg ctcaggtgtt aaaagacgcg 1140
gcttataatg ccgcatatgc gacctcaatt actactggta ctccgggtga tgcgggagcc 1200
gcgggagccg ctgcaactgc gggtaatgcc gcggtgggag cgctgggcgc aacggcagtt 1260
gataatacca cggcagatgt tgccgatatc tctatctcag cttcgcaaat ggcgagcatc 1320
cttcaggata aagatttcac cttaagtgat ggtagtgata cttacaacgt gaccagcaat 1380
gctgtcacta tcaatggcaa agcagcaaac attgatgaca gcggcgcaat cacagaccaa 1440
accagtaaag ttgtcaatta tttcgctcat actaacggta gcgtgactaa cgatacaggc 1500
tccactattt atgcgacaga agatggtagc ctgaccaccg atgcagcaac caaagccgaa 1560
accaccgccg atcccctgaa agctctggac gaagccatca gctccatcga caaattccgc 1620
tcctccctcg gtgcggtgca aaaccgtctg gattccgcgg tcaccaacct gaacaacacc 1680
accaccaacc tgtctgaagc gcagtcccgt attcaggacg ccgactatgc gaccgaagtg 1740
tccaacatgt cgaaagcgca gattatccag caggccggta actccgtgct ggcaaaagct 1800
aaccaggtac cacagcaggt tctgtctctg ctgcagggtt aa 1842

<210> 20
<211> 1731
<212> DNA

<213> Escherichia coli

<400> 20

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60 
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120 
gcgaaggatg acgccgcagg tcaggcgatt gctaaccgtt ttacttctaa cattaaaggc 180 
ctgactcagg cggcccgtaa cgccaacgac ggtatttctg ttgcgcagac caccgaaggc 240 
gcgctgtccg aaattaacaa caacttacag cgtgtgcgtg agctgactgt tcaggcgacc 300
accggtacca actcccagtc tgatctggac tctatccagg acgaaatcaa atcccgtctg 360
gacgaaattg accgcgtatc cggtcagacc cagttcaacg gcgtgaacgt gctggcaaaa 420 
gacggttcca tgaaaattca ggttggcgcg aatgatggcc agaccatcac tatcgacctg 480 
aagaagattg actcttctac gttgaaactg actggtttta acgtgaatgg ttctggttct 540 
gtggcgaata ctgcggcgac taaagcggat ttggctgctg ctgcaattgg tacccctggg 600 
gcagcagatt ctacaggtgc cattgcttac acagtaagtg ctgggctgac taaaactaca 660 
gccgcagatg tactgtctag cctcgctgat ggtacgacta ttacagccac aggcgtgaaa 720 
aatggctttg ctgcaggagc cacttccaat gcctataaac ttaacaaaga taataataca 780 
tttacttatg acacgactgc tacgacagct gagctgcagt cttacctgac tccgaaagcg 840 
ggcgacactg caacattcag tgttgaaatt ggtggtacta cacaagacgt cgtgctgtcc 900 
agtgatggca aactcactgc taaggatggc tctaagcttt acattgatac aactggtaat 960 
ttaactcaga atggtggtaa taacggtgtt ggaacactcg cggaagcgac tctgagtggt 1020 
ttagctctga acaaaaatgg tttaacggct gttaaatcca caattactac agctgataac 1080 
acttcgattg tactgaatgg ttcaagcgat ggtactggta atgctggtac tgaaggtacg 1140 
attgctgtta caggcgctgt aattagttca gctgctctgc aatctgcaag caaaacgact 1200 
ggtttcactg ttggtacagt agacacagct ggttatatct ctgtaggtac tgatgggagt 1260
gttcaggcat atgatgctgc gacttctggc aacaaagctt cttacaccaa cactgacggt 1320 
acactgacta ctgataacac cactaaactg tatctgcaga aagatggctc tgtaaccaac 1380
ggttcaggta aagcggtcta tgtagaagcg gatggtgatt tcactaccga cgctgcaacc 1440 
aaagccgcaa ccaccaccga tccgctggcc gctctggatg acgcaatcag ccagatcgac 1500 
aagttccgtt catccttggg tgctatccag aaccgtctgg attctgcagt caccaacctg 1560 
aacaacacca ccaccaacct gtctgaagcg cagtcccgta ttcaggacgc cgactatgcg 1620 
accgaagtgt ccaatatgtc gaaagcgcag atcatccagc aggccggtaa ctccgtgctg 1680 
gcaaaagcca accaggtacc gcagcaggtt ctgtctctgc tgcagggtta a 1731 

<210> 21 
<211> 1380 
<212> DNA 

<213> Escherichia coli 
<400> 21 

aacaaatctc agtcttctct gagctccgcc attgaacgtc tctcttctgg cctgcgtatt 60 

aacagtgcta aagatgacgc agcaggtcag gcgattgcta accgttttac agcaaatatt 120 

aaaggtctga ctcaggcttc ccgtaacgcg aatgatggta tttctgttgc gcagaccact 180 

gaaggtgcgc tgaatgaaat taacaacaac ctgcagcgta ttcgtgaact ttctgttcag 240 

gcaactaacg gtactaactc tgacagcgat ctttcttcta tccaggctga aattactcaa 300 

cgtctggaag aaattgaccg tgtatctgag caaactcagt ttaacggcgt gaaagtcctt 360

gctgaaaata atgaaatgaa aattcaggtt ggtgctaatg atggtgaaac catcactatc 420 

aatctggcaa aaattgatgc gaaaactctc ggcctggacg gttttaatat cgatggcgcg 480 

cagaaagcaa ccggcagtga cctgatttct aaatttaaag cgacaggtac tgataattat 540 

caaattaacg gtactgataa ctatactgtt aatgtagata gtggcgtagt acaggataaa 600 

gatggcaaac aagtttatgt gagtactgcg gatggttcac ttacgaccag cagtgatact 660 

caattcaaga ttgatgcaac taagcttgca gtggctgcta aagatttagc tcaagggaat 720 

aagattgtct acgaaggtat cgaatttaca aataccggca ctgtcgctat agatgccaaa 780

ggtaatggta aattaaccgc caatgttgat ggtaaggctg ttgaattcac tatttcgggg 840 

agtactgata catcaggtac tagtgcaacc gttgccccta cgacagccct atacaaaaat 900 

agtgcagggc aattgactgc aacaaaagtt gaaaataaag cagcgacact atctgatctt 960 

gatctgaacg ctgccaagaa aacaggaagc acgttagttg ttaacggtgc aacttacgat 1020 

gttagtgcag atggtaaaac gataacggag actgcttctg gtaacaataa agtcatgtat 1080 

ctgagcaaat cagaaggtgg tagcccgatt ctggtaaacg aagatgcagc aaaatcgttg 1140 

caatctacca ccaacccgct cgaaactatc gacaaagcat tggctaaagt tgacaatctg 1200 

cgttctgacc tcggtgcagt acaaaaccgt ttcgactctg ccatcaccaa ccttggcaac 1260 

accgtaaaca acctgtcttc tgcccgtagc cgtatcgaag atgctgacta cgcgaccgaa 1320

gtgtctaaca tgtctcgtgc gcagatcctg caacaagcgg gtacctctgt tctggcacag 1380



<210> 22 
<211> 1767 
<212> DNA 

<213> Escherichia coli 



<400> 22 

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120
gcgaaggatg acgcagcggg tcaggcgatt gctaaccgtt ttacttctaa cattaaaggc 180
ctgactcagg cggcacgtaa cgccaacgac ggtatctctc tggcgcagac caccgaaggt 240
gcgctgtctg aaatcaacaa caacttacag cgtgtacgtg aactgaccgt tcaggcaacc 300
accggtacta actccgactc cgacctggct tctattcagg acgaaatcaa atcccgtctg 360
gatgaaattg accgcgtatc tggtcagact cagttcaacg gcgtgaacgt gctggcaaaa 420
gacggttcca tgaaaattca ggtaggtgct aacgacggcc agactatcac tattgacctg 480
aaaaaaatcg actctgatac tctgggcctg aatggtttta acgtgaatgg ttctgggacg 540
attaccaaca aagcagcaac tgtcagtgat gttactcgcg caggcggtac attggtgaat 600
ggtgcctatg atataaaaac cactaacaca gcgctgacta caactgatgc cttcgcgaaa 660
ttgaatgatg gtgatgttgt tactatcaat aatggtaagg atactgccta taaatataat 720
gctgctacag gtgggtttac gacggatgtc tccatctccg gggatcctac cgctgctgac 780
gctactgcta ataaaactgc ccgtgatgca cttgcggcgt ctttacatgc tgagccgggt 840
aaaactgtta atggttcttg gactacgaat gatggtacgg taaaatttga taccgatgcc 900
gatggtaaga tttctattgg tggtgttgct gcttatgtag atgcagcagg caacctgacc 960
actaacgcag caggtatgac gactcaagca acaactaccg atttggttac tgctgctgca 1020
tctgctactg gtaagggtgg atccctgacc tttggtgaca cgacgtataa aattggtcag 1080
ggtacggctg gggttgatcc tgatgacgct tcagatgatg tactgggcac catttcttac 1140
tctaaatcag taagcaagga tgttgttctt gctgatacta aagcaactgg taacacgaca 1200
acagttgatt tcaactccgg tatcatgact tcaaaggtta gtttcgatgc aggtacatca 1260
actgatacat tcaaagatgc agatggtgct atcaccaaaa ctaaagaata caccacttct 1320
tatgctgtaa ataaagatac tggtgaagtt accgttgctg attatgctgc ggtagatagc 1380
gccgataagg ctgttgatga tactaa^tat aaaccgacta tcggcgcgac agttaacctg 1440
aattctgcag gtaaattgac cactgatacc accagtgcag gcacagcaac caaagatcct 1500
ctggctgccc tggacgctgc tatcagctcc atcgacaaat tccgttcatc cctgggtgct 1560
atccagaacc gtctggattc cgcagtcacc aacctgaaca acaccactac caacctgtcc 1620
gaagcgcagt cccgtattca ggacgccgac tatgcgaccg aagtgtccaa catgtcgaaa 1680
gcgcagatta tccagcaggc cggtaactcc gtgctggcaa aagccaacca ggtaccgcag 1740
caggttctgt ctctgctaca gggttaa 1767

<210> 23
<211> 1383 
<212> DNA 

<213> Escherichia coli 



<400> 23 

aacaaaaacc agtctgcgct gtcgacttct atcgagcgcc tttcttctgg tctgcgtatt 60 
aacagcgcta aagatgacgc tgcgggccag gcgattgcta accgcttcac ttctaacatc 120 
aaaggtctga ctcaggccgc acgtaacgcc aacgacggta tttctctggc gcagaccact 180 
gaaggcgcgc tgtctgagat taacaacaac ttgcagcgtg tgcgtgagtt gactgtacag 240 
gcgacgaccg ggactaactc tgattctgac ctgtcttcta tccaggatga aatcaaatcc 300 
cgtttaagcg aaattgaccg tgtatctggt cagactcagt ttaacggcgt gaacgtactg 360 
gctaagaatg acaccctgtc tattcaggta ggtgcaaatg acggtcagac tatcaatatt 420 
gacctgcagc aaatcgattc tcatacactg ggtctggatg gtttcagcgt taaaaataat 480 
gatgcagtga aaaccagtgc tgccgtgaat actcttgggg ggggggcagg ttctgttgct 540 
gtcgacttcg caacaaccag tttgactgct atcactggtc tcggtagcgg tgctatcagc 600 
gaaattgcta aagacgataa tggtgattac tacgcgcatg tcacagggac tacgggtaat 660 
actgctgatg gttactatgc tgtcgatatc gacaaggcta ccggtgaggt cgctctgaaa 720 
gatggtaacg tagatacacc gacaggtacg ccaacgacga caagcacata tgacttcaca 780 
gacgctggtc aaaccgtttc ctttggcact gatgctgcaa cagccggtat cagcactggt 840 
gcttctctcg ttaaacttca ggatgagaaa ggcaatgata ctgctactta tgcaatcaaa 900 
gcacaagatg gcagcctgta tgccgccaac gttgatgagg ctaccggtaa agtcactgtc 960 
aaaaccgcca gctatactga tgctgacggc aaagcagtga ccgatgccgc tgtaaaactg 1020 
ggtggtgaca atggcacaac cgaaattgtt gtcgatgctg cgtcaggtaa aacttacgat 1080 
gctggtgcac tgcaaaacgt tgatctctcc agtgcaacca acacggtaac cgcaatcccg 1140 
aacggtaaaa ccacgtctcc gctggctgcc cttgacgacg caatcagcca gatcgacaaa 1200 
ttccgctcct ccctcggtgc ggtgcagaac cgtctggatt ccgcggtcac caacctgaac 1260 
aacaccacta ccaacctgtc tgaagcgcag tcccgtattc aggacgctga ctatgcgacc 1320
gaagtatcca acatgtcgaa agcgcagatc atccagcagg caggtaactc cgtgctgtcc 1380
aaa 1383



<210> 24 
<211> 1197 
<212> DNA 

<213> Escherichia coli 



<400> 24 

gcgctgtcga cttctatcga gcgcctctct tctggtctgc gcattaacag cgctaaagat 60
gacgctgcgg gccaagcgat tgctaaccgc ttcacttcta acatcaaagg tctgactcag 120
gccgcacgta acgccaacga cggtatttct ctggcgcaga ccactgaagg cgcactgtct 180
gaaatcaaca acaacttgca gcgtgttcgt gaactgaccg ttcaggccac taccggtact 240
aactctgatt ctgacctgtc ttcaatacag gacgaaatca aatcccgtct cgatgaaatt 300
gaccgcgtat ccggtcagac tcagttcaac ggcgttaatg ttctttccaa agatggttca 360
atgaaaattc aggttggtgc gaatgatggt caaactatct ccatcgatct gaagaaaatt 420
gattcttcaa ctttggggct gaatggcttc tcagtttcta aaaactctct taatgtcagc 480
aatgctatca catctatccc gcaagccgct agcaatgaac ctgttgatgt taacttcggt 540
gatactgatg agtctgcagc aatcgcagcc aaattggggg tttccgatac gtcaagcctg 600
tcgctgcaca acatccttga taaagatggt aaggcaacag ctgattatgt tgttcagtca 660

ggtaaagact tctatgctgc ttctgttaat gccgcttcag gtaaagtaac cttaaacacc 720 

attgatgtta cttatgatga ttatgcgaac ggtgttgacg atgccaagca aacaggtcag 780 

ctgatcaaag tttcagcaga taaagacggc gcagctcaag gttttgtcac acttcaaggc 840 

aaaaactatt ctgctggtga tgcggcagac attcttaaga atggagcaac agctcttaag 900 

ttaactgatc tgaatttaag tgatgttact gatactaatg gtaaggtaac cacaactgcg 960 

actgagcaat ttgaaggtgc ttcaactgag gatccgctgg cgcttctgga taaagctatt 1020 

gcatcagtcg acaaattccg gtcttctcta ggtgccgtgc agaaccgtct cgattccgct 1080 

atcaccaacc tgaacaacac caccaccaac ctgtctgaag cgcagtcccg tattcaggac 1140 

gccgactatg cgaccgaagt gtccaacatg tcgaaagcgc agatcatcca gcaggca 1197 



<210> 25 
<211> 1674 
<212> DNA 

<213> Escherichia coli 



<400> 25 

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120
gcgaaggatg acgccgcagg tcaggcgatt gctaaccgtt ttacttctaa cattaaaggc 180
ctgactcagg ctgcacgtaa cgccaacgac ggtatttctg ttgcacagac cactgaaggc 240
gcgctgtccg aaatcaacaa caacttacag cgtattcgtg aactgacggt tcaggccact 300
acagggacta actccgattc tgacctggac tccatccagg acgaaatcaa atctcgtctg 360
gacgaaattg accgcgtatc tggtcagacc cagttcaacg gcgtgaacgt gctgtctaaa 420
gatggctcga tgaaaattca ggtcggcgcg aacgatggcg aaacgattac tattgatctg 480
aagaaaattg actctgatac gctaaatctg gctggtttta acgtgaatgg tgctggctct 540
gttgataatg ccaaggcgac tggcaaagat cttactgatg ctggttttac ggcaagcgca 600
gctgatgcta atggcaaaat cacttatacc aaagacaccg ttactaaatt cgacaaagcg 660
acagcggctg atgtattggg caaagcggct gctggcgata gcattaccta tgcgggcact 720
gatactggct taggagtcgc tgctgatgcc tcgacttaca cctacaatgc agccaataag 780
tcttacactt ttgatgctac tggtgttgcc aaggcggatg ctggaacggc actgaaaggg 840
tacttaggcg catctaacac cggtaaaatt aatatcggtg gtaccgagca agaagttaac 900
attgccaaag atggctccat caccgatacc aatggcgatg cgctgtatct cgatagtacc 960
ggcaacttaa ccaaaaatac cgcgaatttg ggggctgctg ataaagcaac tgtagataaa 1020
ctgtttgctg gtgctcagga tgcaacgatc accttcgata gcggcatgac agctaaattc 1080
gatcaaactg ctggtaccgt tgatttcaaa ggcgcgtcta tttctgctga tgcaatggca 1140
tcaaccttaa ataatggttc ctatacagcc aacgtaggtg gtaaggctta tgccgtaacc 1200
gctggcgcag ttcagacagg tggcgcagat gtgtataaag ataccactgg cgcactgacg 1260
actgaagatg acgaaaccgt taccgcgacc tactacggtt ttgctgatgg taaagtttct 1320
gacggtgaag gttctactgt ctataaagct gctgatggtt ccatcactaa agatgcgact 1380
accaagtctg aagcaaccac tgaccctctg aaagcccttg acgacgcaat cagccagatc 1440
gacaaattcc gctcctccct cggtgccgtt caaaaccgtc tggattccgc cgtcaccaac 1500
ctgaacaaca ccactaccaa cctgtctgaa gcgcagtccc gtattcagga cgccgactat 1560
gcgaccgaag tgtccaacat gtcgaaagcg cagatcattc agcaggccgg taactccgtg 1620
ctggcaaaag ccaaccaggt accgcagcag gttctgtctc tgctgcaggg ttaa 1674

<210> 26
<211> 1365
<212> DNA

<213> Escherichia coli 
<400> 26 

aacaaatctc agtcttctct tagctctgct attgagcgtc tctcttctgg cctgcgtatt 60 
aacagtgcta aagatgacgc agcaggtcag gcgattgcta accgttttac ggcaaatatt 120 
aaaggtctga ctcaggcttc ccgtaacgcg aatgatggta tttctgttgc gcagactact 180 
gaaggtgcgc tgaatgaaat taacaacaac ctgcagcgtg tacgtgaact gactgttcag 240 
gcaactaacg gtactaactc tgacagcgat ctttcttcta ttcaggcaga aattactcaa 300
cgtctggaag aaattgaccg tgtatctgag caaactcagt ttaacggcgt gaaagtcctt 360
gccgaaaata atgaaatgaa aattcaggtt ggtgctaatg atggggaaac catcactatc 420 
aatctggcaa aaattgatgc gaaaactctc ggcctggacg gctttaatat cgatggcgcg 480 
cagaaagcaa ctggcagtga cctgatttct aaatttaaag cgacaggtac tgataattat 540 
caaattaacg gtactgataa ctatactgtt aatgtagata gtggagcagt tcaaaatgag 600 
gatggtgacg caatttttgt tagcgctacc gatggttctc tgactactaa gagtgataca 660 
aaagtcggtg gtacaggtat tgatgcgact gggcttgcaa aagccgcagt ttctttagct 720 
aaagatgcct caattaaata ccaaggtatt actttcacca acaaaggcac tgatgcattt 780 
gatggcagtg gtaacggcac tctaaccgct aatattgatg gcaaagatgt aacctttact 840 
attgatgcga cagggaagga cgcaacatta aaaacgtctg atcctgttta caaaaatagt 900 
gcaggtcagt tcactacaac taaggttgaa aacaaagccg ctacagcatc ggatctggac 960 
ttaaataacg ctaaaaaagt gggtagttct ttagttgtaa atggcgctga ttatgaagtt 1020
agcgctgatg gtaagacagt aactgggctt ggcaaaacta tgtatctgag caaatcagaa 1080 
ggtggtagcc cgattctggt aaaagaagat gcagcaaaat cgttgcaatc tactaccaac 1140 
ccgctcgaaa ccatcgacaa ggcattggct aaagttgaca atctgcgttc tgacctcggt 1200 
gcagtacaaa accgtttcga ctctgctatc accaaccttg gcaacaccgt aaacaacctg 1260 
tcttctgccc gtagccgtat cgaagatgct gactacgcga ccgaagtgtc taacatgtct 1320 
cgtgcgcaga tcctgcaaca agcgggtacc tctgttctgg cgcag 1365

<210> 27 
<211> 1740 
<212> DNA 

<213> Escherichia coli 
<400> 27 

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60 
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120 
gcgaaggatg acgccgcagg tcaggcgatt gctaaccgtt ttacttctaa cattaaaggc 180 
ctgactcagg ctgcacgtaa cgccaacgat ggtatttctg ttgcacagac cactgaaggc 240 
gcgctgtccg aaatcaacaa caacttacag cgtatccgtg aactgacggt tcaggcttct 300
accgggacta actccgattc ggatctggac tccattcagg acgaaatcaa atcccgtctg 360
gacgaaattg accgcgtatc tggccagacc cagttcaacg gcgtgaacgt actggcgaaa 420
gacggttcaa tgaaaattca ggttggtgcg aatgacggcc agactatcac gattgatctg 480 
aagaaaattg actctgatac gctggggctg agtgggttta atgtgaatgg tagcggggct 540 
gtggctaata ctgcagcgac taaatctgat ttggcagcag ctcaactctt ggctccaggt 600 
actgctgatg ctaatggtac agttacctat actgttggcg caggcctgaa aacatctaca 660 
gctgcagatg taattgcgag tttggctaat aacgcaaaag ttaatgccac aattgcaaat 720 
ggttttggat cgccaacagc tacagattat acatacaaca gcgctacagg cgattttaca 780
tatagtgcaa ctattgcagc tggtacaaat tctggtgata gtaacagtgc tcagttacaa 840
tccttcctga caccaaaagc gggcgatact gctaacttaa acgttaaaat tggttctacg 900

tcaattgacg ttgtattggc tagcgacggt aaaattaccg cgaaagatgg ttcagaacta 960 

tttattgacg tagatggtaa cctcactcaa aacaatgctg ggactgtcaa agcagccact 1020 

cttgatgcac tgactaaaaa ctggcataca acaggcacac cgagtgccgt atctacggta 1080 

attacaactg aagatgaaac aaccttcact ctggctggcg gtactgatgc tactacttct 1140 

ggtgcaatca ctgtagcaaa tgcaagaatg agtgctgagt ctcttcaatc ggcaactaag 1200 

tccacaggat tcacagttga tgttggagct actggtacca gcgcaggcga tattaaagtt 1260 

gatagtaaag gtatagtaca acaacacaca ggtacaggtt ttgaagacgc ttacaccaaa 1320 

gctgatggtt cactgactac cgataataca accaatctgt ttttgcaaaa agacggaact 1380

gtgaccaatg gttcaggtaa agcagtctat gtttcagcgg atggtaattt tactactgac 1440 

gctgaaacta aagctgcaac caccgccgat ccactgaaag ctctggacga agcgatcagc 1500 

tccatcgaca aattccgttc ttccctcggt gcggtgcaaa accgtctgga ttccgcagtc 1560 

accaacctga acaacaccac tactaacctg tctgaagcgc agtcccgtat tcaggacgct 1620

gactatgcga ccgaagtgtc caatatgtcg aaagcgcaga tcatccagca ggccggtaac 1680 

tccgtgctgg caaaagctaa ccaggtaccg cagcaggttc tgtctctgct gcagggttaa 1740 



<210> 28 
<211> 1233 
<212> DNA 

<213> Escherichia coli 
<400> 28 

aacaaaaacc agtctgcgct gtcgacttct atcgagcgcc tctcttctgg tctgcgcatt 60
aacagcgcta aagatgacgc tgcgggccag gcgattgcta accgcttcac ttctaacatc 120
aaaggtctga ctcaggccgc acgtaacgcc aacgacggta tctctctggc gcagaccact 180
gaaggcgcac tgtctgaaat caacaacaac ttgcagcgtg ttcgtgagct gaccgttcag 240
gccactaccg gtactaactc tgattctgac ctgtcttcaa tccaggacga aatcaaatcc 300
cgtctcgatg aaattgaccg cgtatccggt cagactcagt tcaacggcgt gaacgtactg 360
gcaaaagata acaccatgaa gattcaggtt ggtgcgaacg atggtcagac tatatccatc 420
gacctgcaaa aaatcgactc ttctactctt ggtttgaacg gtttctccgt ttctaaaaat 480
gctctcgaaa ctagcgaagc gatcactcag ttgccgaacg gtgcgaatgc accaatcgct 540
gtgaagatgg atgcgtctgt tctgaccgat cttaacatta ctgatgcttc cgctgtttcg 600
ctgcacaacg taactaaagg tggtgtcgca acgtctactt atgttgttca gtatggcgat 660
aagagctatg cagcatctgt tgatgcggga ggtacagtaa aactgaataa agccgacgta 720
acatataacg acgcagcaaa tggtgttacg aatgccaccc agattggtag tctggttcag 780
gttggtgctg atgcaaacaa tgatgcagtt ggttttgtta ccgtgcaggg gaaaaactat 840
gttgctaatg actcattagt caatgctaat ggcgctgctg gcgctgcagc aactagagtt 900
acaattgatg gtgatggtag ccttggagct aaccaggcta aaattgaact tagccaaaat 960
ggtgctactg ctgcaacatc agagttcgct ggtgcttcaa ccaacgatcc actgactctg 1020
ctggacaaag ctatcgcatc tgttgataaa ttccgttctt ctttgggggc ggtacagaac 1080
cgtctgagct ccgctgtaac caacctgaac aacaccacta ccaacctgtc tgaagcgcag 1140
tcccgtattc aggacgccga ctatgcgacc gaagtgtcca acatgtcgaa agcgcagatc 1200
atccagcagg caggtaactc cgtgctgtcc aaa 1233

<210> 29
<211> 1713
<212> DNA

<213> Escherichia coli
<400> 29

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60 
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120 
gcgaaggatg acgccgcagg tcaggcgatt gctaaccgtt ttacttctaa cattaaaggc 180 
ctgactcagg ctgcacgtaa cgccaacgac ggtatttctg ttgcacagac cactgaaggc 240 
gcgctgtccg aaatcaacaa caacttacag cgtattcgtg aactgacggt tcaggcgacg 300
accggaacta actccacctc tgacctggac tccattcagg acgaaatcaa atcccgtctt 360
gatgaaattg accgcgtatc cggccaaacc cagttcaacg gcgtgaacgt actgtcaaaa 420
gatggctcga tgaaaattca ggtcggcgca aatgatggtg aaaccatcac gattgatctg 480 
aaaaagatcg actcttctac attgaagctg accagcttca atgttaacgg taaaggcgct 540 
gttgataatg ctaaagccac tgaagcagat ctgaccgctg cgggcttctc ccaaggtgca 600 
gtcgtcagtg gcaacagcac ctggactaaa tctactgtta ctacctttaa tgcagcaaca 660 
gctaccgacg tgctggcaag cgttagcggc ggcagcacta ttagcggtta taccggtaca 720 
aacaatggat taggcgtagc ggcttctact gcatatacct acaacgcaac cagcaagtct 780 
tattcatttg acgcaaccgc acttaccaat ggcgatggta ctggggccac cactaaagtt 840 
gctgatgtgc tgaaagccta tgcagcaaac ggtgataata cggctcagat ctccatcggc 900 
ggaagcgctc aggacgttaa aattgccagc gatggcaccc tgactgacgt caatggtgat 960
gctttatata ttggttctga cggcaacctg actaaaaacc aggccggcgg tccagatgcg 1020 
gcaacgttgg acggtatttt caacggtgcg aatggtaatg cagcagttga tgcgaagatt 1080 
acattcggca gcggcatgac cgttgatttc acccaggcta gcaaaaaagt ggatattaag 1140 
ggcgcaacgg tatccgccga agatatggac actgcgttaa ctgggcaggc ttataccgta 1200 
gctaacggcg cacagtcttt tgacgttgcc gctggtgggg cagtaaccgc tactacaggt 1260 
ggcgctaccg taaatattgg tgctgatggt gaactgacga ctgcgaccaa caagactgtc 1320 
acagaaactt atcacgaatt tgctaacggc aatattctgg atgatgacgg cgcggctctg 1380 
tacaaagcgg ctgacggttc tctgaccact gaagctactg gtaaatccga agtgaccacg 1440 
gatccgctga aagcgctgga cgatgctatc gcatccgtag acaaattccg ctcctccctc 1500 
ggtgcggtgc agaaccgtct ggattccgca gtcaccaacc tgaacaacac cactaccaac 1560 
ctgtctgaag cgcagtcccg cattcaggac gccgactatg cgaccgaagt gtccaatatg 1620 
tcgaaagcgc agatcatcca gcaggccggt aactccgtgc tggcaaaagc caaccaggta 1680 
ccgcagcagg ttctgtctct gctgcagggt taa 1713 

<210> 30 
<211> 1668 
<212> DNA 

<213> Escherichia coli 
<400> 30 

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60 

aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120

gctaaggatg acgccgcggg tcaggcgatt gctaaccgtt ttacttctaa cattaaaggc 180 

ctgactcagg ctgcacgtaa cgccaacgac ggtatttctg ttgcgcagac cactgaaggc 240 

gcgctgtccg aaatcaacaa caacttacag cgtatccgtg aactgacggt tcaggcttct 300

accgggacta actccgattc ggatctggac tccattcagg acgaaatcaa atcccgtctg 360

gacgaaattg accgcgtatc tggccagacc cagttcaacg gcgtgaacgt actggcgaaa 420

gacggttcaa tgaaaattca ggttggtgcg aatgacggcc agactatcac tattgatctg 480 
aagaaaattg actcagatac gctggggctg agtgggttta atgtgaatgg tggcggggct 540 

gttgctaata ctgcagcgac taaagatgat ttggtcgctg catcagtttc agctgcggta 600
ggtaatgaat acactgtctc tgctggcctg tcgaaatcaa ctgctgctga tgttattgct 660
agtctcacag atggtgcgac agtaactgcg gctggtgtaa gcaatggttt tgctgcaggg 720 
gcaactggag atgcttataa attcaatcaa gcaaacaaca cttttactta caataccacc 780 
tcaacagcgg cagaactcca atcttacctc acgcctaagg cgggggatac cgcaactttc 840 
tccgttgaaa ttggtggcac caagcaggat gttgttctgg ctagtgatgg caaaatcaca 900 
gcaaaagacg ggtctaaact ttatattgac accacaggga atttaaccca aaacggtgga 960 
ggtactttag aagaagctac cctcaatggc ttagctttca accactctgg tccagccgct 1020 
gctgtacaat ctactattac tactgcggat ggaacttcaa tagttctagc aggttctggc 1080 
gactttggaa caacaaaaac tgctggggct attaatgtca caggagcagt gatcagtgct 1140 
gatgcacttc tttccgccag taaagcgact gggtttactt ctggcactta taccgtaggt 1200 
acagatggag ttgttaaatc tggtggcaat gacgtttata acaaagctga cgggacggga 1260 
ttaactactg acaataccac aaaatattat ttacaagatg acgggtctgt aactaatggt 1320 
tctggtaaag ctgtgtatgc tgatgcaaca ggaaaactaa ctactgacgc tgaaactaaa 1380 
gccgaaacca ccgccgatcc cctgaaagct ctggacgaag cgatcagctc catcgacaaa 1440 
ttccgttctt ccctcggtgc ggtgcaaaac cgtctggatt ccgcggtcac caacctgaac 1500 
aacaccacta ccaacctgtc cgaagcgcag tcccgtattc aggacgccga ctatgcgacc 1560 
gaagtgtcca acatgtcgaa agcgcagatc atccagcagg ccggtaactc cgtgctggca 1620 
aaagctaacc aggtaccgca gcaggttctg tctctgctgc agggttaa 1668 

<210> 31 
<211> 1713 
<212> DNA 

<213> Escherichia coli 
<400> 31 

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60 
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120 
gcgaaggatg acgccgcggg tcaggcgatt gctaaccgtt ttacttctaa cattaaaggc 180 
ctgactcagg ctgcacgtaa cgccaacgac ggtatttccg ttgcgcagac caccgaaggc 240 
gcgctgtccg aaatcaacaa caacttacag cgtatccgtg aactgacggt tcaggccact 300 
accggtacta actccgattc tgacctggac tccatccagg acgaaatcaa atctcgtctt 360 
gatgaaattg accgcgtatc tggtcagacc cagttcaatg gcgtgaatgt gttgtccaaa 420 
gacggttcaa tgaaaattca ggtgggcgca aatgatggtg aaaccatcac gattgacctg 480 
aaaaaaatcg actcttctac actgaagctg accagcttca acgtcaacgg taaaggcgct 540 
gttgataatg caaaagccac tgaagcagat ctgaccgctg cgggcttctc ccaaagtgca 600 
gttgtcagtg gcaatagcac ctggactaaa tctactgtta ctacctttaa tgcagcaaca 660 
gctaccgatg tgctggctag cgttagtggc ggcagcacta ttagcggtta tgctggcaca 720 
aacaatgggt taggcgtagc ggcttctact gcatatacct acaacgcaac cagcaagtct 780 
tattcatttg acgcaaccgc acttactaat ggtgatggta ctgcgggctc aactaaagtt 840 
gctgatgttc tgaaagccta tgcagcaaac ggcgataaca cggctcagat ctccatcggt 900 
ggtagcgctc aggaagttaa aattgccagc gatggtaccc tgacggatac taatggcgat 960 
gctttataca ttggtgctga cggtaacctg acgaaaaacc aggccggcgg cccagccgcg 1020 
gcaacgttgg acggtatttt caacggtgcg aatggtcatg atgcagttga tgcgaagatt 1080 
accttcggca gcggcatgac cgttgacttc acccaggtta gcaacaatgt ggatattaag 1140 
ggcgcgacgg tatccgccga agatatgaac actgcgttaa ccggtcaggc ttataccgta 1200 
gctaacggcg cacagtctta tgacgttgcc gctgatggtg cagtaactgc tactacaggt 1260 
ggagcgaccg taaatattgg tgctgagggt gaactgacga ctgcggccaa caagactgtc 1320 
acagaaactt atcacgaatt tgctaacggc aatattctgg atgatgacgg cgcggctctg 1380
tataaagcgg ctgacggctc tctgaccact gaagctacag gtaaatctga agcgaccacg 1440
gatccgctga aagcgctgga cgatgctatc gcatccgtag acaaattccg ttcttccctg 1500
ggtgccgtgc agaaccgtct ggattccgca gtcaccaacc tgaacaacac cactaccaac 1560
ctgtccgaag cgcagtcccg tattcaggac gccgactatg cgaccgaagt gtccaacatg 1620
tcgaaagcgc agattattca gcaggcaggt aactccgtgc tggcaaaagc taaccaggta 1680
ccgcagcagg ttctgtctct gctgcagggt taa 1713

<210> 32
<211> 1188
<212> DNA

<213> Escherichia coli
<400> 32

aacaaaaacc agtctgcgct gtcgacttct atcgagcgcc tctcttctgg tctgcgcatt 60
aacagcgcta aagatgacgc tgcgggccag gcgattgcta accgcttcac ttctaacatc 120
aaaggtctga ctcaggccgc acgtaacgcc aacgacggta tctctctggc gcagaccact 180
gaaggcgcac tgtctgaaat caacaacaac ttgcagcgtg tgcgtgagtt gactgttcag 240
gcgacgaccg ggactaactc tgattctgac ctgtcttcta ttcaggacga aatcaaatcc 300
cgtctggatg aaattgaccg tgtttccggt cagacccagt tcaacggcgt gaacgtgctg 360
gctaaaaacg gttctatggc gattcaggtt ggcgcgaatg atgggcagac catcaacatc 420
gacctgcaga aaatcgactc ttctactctg ggcctgggcg gcttctccgt atctaacaat 480
gcactgaaac tgagcgattc tatcactcag gttggtgcga gtggttcact ggcagatgtg 540
aaactgagct ctgttgcctc ggctctgggt gtagacgcaa gcactctgac tctgcacaac 600
gtacagaccc cagctggcgc agcaacagct aactatgttg tctcttctgg ttctgacaac 660
tactcagtat ctgttgaaga tagctccggt acagttacgc tgaacaccac tgatataggt 720
tataccgata ccgctaatgg cgttactacc ggttccatga ctggtaagta cgttaaagtt 780
ggagctgatg cattgggtgc tgctgtaggt tatgtcaccg tacagggaca aaacttcaaa 840
gctgatgctg gcgcgctggt taactccaag aatgctgctg gtagtcagaa tgttacttct 900
gcaattggcg atattgctaa taaagcgaat gctaacattt acactggaac ctcttctgca 960
gatccactgg ctctgctgga caaagctatc gcatctgttg ataaattccg ttcttctcta 1020
ggggcggtgc agaaccgtct gagctctgct gtaaccaacc tgaacaacac cactaccaac 1080
ctgtccgaag cgcagtcccg tattcaggac gccgactatg cgaccgaagt gtccaacatg 1140
tcgaaagcgc agatcatcca gcaggcgggt aactccgtgc tgtctaaa 1188

<210> 33
<211> 1638
<212> DNA

<213> Escherichia coli
<400> 33

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120
gcgaaggatg acgccgccgg tcaggcgatt gctaaccgtt ttacttctaa cattaaaggc 180
ctgactcagg ctgcacgtaa cgccaatgac ggtatttctg ttgcacagac cactgaaggc 240
gcgctgtccg aaatcaacaa caacttacag cgtattcgtg aactgacggt tcaggcttct 300
accgggacta actctgattc ggatctggac tccattcagg acgaaatcaa atcccgtctc 360
gacgaaattg accgcgtatc cggtcagacc cagttcaacg gcgtgaacgt actggcaaaa 420
gacggttcga tgaaaattca ggttggtgcg aacgacggcc agactatcac tattgatctg 480
aagaaaattg actctgatac gctggggctg agtgggttta acgtaaatgg tagcgcagat 540
aaggcaagtg tcgcggcgac agctgacgga atggttaaag acggatatat caaagggtta 600 
acttcatctg acggcagcac tgcatatact aaaactacag caaatactgc agcaaaagga 660 
tctgatattc ttgcggcgct taagactggc gataaaatta ccgcaacagg tgcaaatagc 720 
cttgctgata atgcgacatc gacaacttat acttataatg caaccagcaa taccttctcc 780 
tatacggctg acggtgtaaa ccaaacgaat gctgcagcaa atctcatacc tgcagcaggg 840 
aaaacgacag ctgcatcagt tactattggt gggacagcac agaatgtaaa tattgatgat 900 
tcgggcaata ttacttcaag tgatggcgat caactttatc tggattcaac aggtaacctg 960 
actaaaaacc aggccggcaa cccgaaaaaa gcaaccgttt ctgggcttct cggaaatacg 1020 
gatgcgaaag gtactgctgt taaaacaacc atcaagacag aggctggtgt aacagttaca 1080 
gctgaaggta atacaggtac tgtaaaaatt gaaggtgcta ctgtttcagc atctgcattt 1140 
acgggcattg catattccgc caacaccggt gggaatactt atgctgttgc cgcaaataat 1200 
actacaaatg gtttcctggc gggggatgac ttaacccagg atgctcaaac tgtttcaacc 1260 
tactactcgc aagccgatgg cacggtcacg aatagcgcag gcaaagaaat ctataaagac 1320 
gctgatggtg tctacagcac agagaataaa acatcgaaga cgtccgatcc attggctgcg 1380
cttgacgacg caatcagctc catcgacaaa ttccgttcat ccttgggtgc tatccagaac 1440 
cgtctggatt ccgcggtcac caacctgaac aacaccacta ccaacctgtc cgaagcgcag 1500 
tcccgtattc aggacgccga ctatgcgacc gaagtgtcca acatgtcgaa agcgcagatc 1560 
atccagcagg ccggtaactc cgtgctggca aaagctaacc aggtaccgca gcaggttctg 1620 
tctctgctgc agggctaa 1638



<210> 34 
<211> 2145 
<212> DNA 

<213> Escherichia coli 



<400> 34 

aacaaatctc agtcttctct gagctccgcc attgaacgtc tctcttctgg cctgcgtatt 60 
aacagtgcta aagatgacgc agcaggtcag gcgattgcta accgttttac agcaaatatt 120 
aaaggtctga ctcaggcttc ccgtaacgcg aatgatggta tttctgttgc gcagaccact 180 
gaaggtgcgc tgaatgaaat taacaacaac ctgcagcgtg tacgtgaact gactgttcag 240 
gcaactaacg gtactaactc tgacagcgat ctttcttcta tccaggctga aattactcaa 300 
cgtctggaag aaattgaccg tgtatctgag caaactcagt ttaacggcgt gaaagtcctt 360 
gctgaaaata atgaaatgaa aattcaggtt ggtgctaatg atggtgaaac catcactatc 420 
aatctggcaa aaattgatgc gaaaactctc ggcctggacg gttttaatat cgatggcgcg 480 
cagaaagcaa ctggcagtga cctgatttct aaatttaaag cgacaggtac tgataactat 540 
gatgttggcg gtgatgctta tactgttaac gtagatagcg gagctgggta atgactccaa 600 
cttattgata gtgttttatg ttcagataat gcccgatgac tttgtcatgc agctccaccg 660 
attttgagaa cgacagcgac ttccgtccca gccgtgccag gtgctgcctc agattcaggt 720 
tatgccgctc aattcgctgc gtatatcgct tgctgattac gtgcagcttt cccttcaggc 780 
gggattcata cagcggccag ccatccgtca tccatatcac cacgtcaaag ggtgacagca 840 
ggctcataag acgccccagc gtcgccatag tgcgttcacc gaatacgtgc gcaacaaccg 900
tcttccggag cctgtcatac gcgtaaaaca gccagcgctg gcgcgattta gccccgacat 960 
agtcccactg ttcgtccatt tccgcgcaga cgatgacgtc actgcccggc tgtatgcgcg 1020 
aggttaccga ctgcggcctg agttttttaa gtgacgtaaa atcgtgttga ggccaacgcc 1080 
cataatgcgg gcagttgccc ggcatccaac gccattcatg gccatatcaa tgattttctg 1140 
gtgcgtaccg ggttgagaag cggtgtaagt gaactgcagt tgccatgttt tacggcagtg 1200
agagcagaga tagcgctgat gtccggcggt gcttttgccg ttacgcacca ccccgtcagt 1260

WO 99/61458 PCT/AU99/00385

agctgaacag gagggacagc tgatagaaac agaagccact ggagcacctc aaaaacacca 1320 
tcatacacta aatcagtaag ttggcagcat taccgcggag ctgttaaaga tactacaggg 1380 
aatgatattt ttgttagtgc agcagatggt tcactgacaa ctaaatctga cacaaacata 1440 
gctggtacag ggattgatgc tacagcactc gcagcagcgg ctaagaataa agcacagaat 1500 
gataaattca cgtttaatgg agttgaattc acaacaacaa ctgcagcgga tggcaatggg 1560 
aatggtgtat attctgcaga aattgatggt aagtcagtga catttactgt gacagatgct 1620 
gacaaaaaag cttctttgat tacgagtgag acagtttaca aaaatagcgc tggcctttat 1680 
acgacaacca aagttgataa caaggctgcc acactttccg atcttgatct caatgcagct 1740 
aagaaaacag gaagcacgtt agttgttaac ggtgcaactt acgatgttag tgcagatggt 1800 
aaaacgataa cggagactgc ttctggtaac aataaagtca tgtatctgag caaatcagaa 1860 
ggtggtagcc cgattctggt aaacgaagat gcagcaaaat cgttgcaatc taccaccaac 1920 
ccgctcgaaa ctatcgacaa agcattggct aaagttgaca atctgcgttc tgacctcggt 1980 
gcagtacaaa accgtttcga ctctgctatc accaaccttg gcaacaccgt aaacaacctg 2040 
tcttctgccc gtagccgtat cgaagatgct gactacgcga ccgaagtgtc taacatgtct 2100 
cgtgcgcaga tcctgcaaca agcgggtacc tctgttctgg cgcag 2145 



<210> 35 
<211> 1587 
<212> DNA 

<213> Escherichia coli 



<400> 35 

aacaagaacc agtctgcgct gtcgagttct atcgagcgtc tgtcttctgg cttgcgtatt 60
aacagcgcga aggatgacgc cgcaggtcag gcgattgcta accgttttac ttctaacatt 120
aaaggcctga ctcaggctgc acgtaacgcc aacgacggta tttctgttgc gcagaccacc 180
gaaggcgcgc tgtccgaaat caacaacaac ttacagcgtg tgcgtgaact gaccgttcag 240
gcaaccaccg gtaccaactc ccagtctgac ctggactcta tccaggacga aattaaatcc 300
cgtctggacg aaattgaccg cgtatccggt cagacccagt tcaacggcgt gaacgtactg 360
gcaaaagacg gttccatgaa aattcaggtt ggcgcgaacg atggccagac catcactatc 420
gacctgaaga agattgactc ttctacgctg aaactgactg gttttaacgt gaatggcaaa 480
gcagcggttg ataatgctaa agcgacggat gcaaatctga ctaccgccgg ttttacacaa 540
ggcgttgtgg attcaaatgg taatagtact tggactaaat caactacgac taatttcgat 600
gcggcaactg cagtaaacgt actagcagca gttaaagatg gcagcacaat caattacacc 660
ggtactggta atggtttagg gattgctgca acaagtgctt atacatatca cgatagcact 720
aaatcctata cctttgattc tacgggggct gcagtagctg gtgccgcgtc cagcctgcaa 780
ggtacttttg gtacagatac gaatactgca aaaatcacca tcgatggttc tgctcaagaa 840
gtaaacatcg ctaaagatgg gaaaattact gatactgatg gtaaagcttt atatatcgat 900
tccactggta atttgactaa gaacggctct gatactttaa ctcaggcaac attgaatgat 960
gtccttactg gtgctaattc agttgatgat acaaggattg acttcgatag cggcatgtct 1020
gtcacccttg ataaagtgaa cagcactgta gatatcactg gcgcatctat ttcagccgct 1080
gcaatgacta atgagttgac aggtaaggcc tataccgtag taaatggtgc agaatcttac 1140
gctgtagcta ctaataacac agtaaaaacg actgctgatg ctaaaaatgt ttatgttgat 1200
gctagtggta aattaactac tgatgacaaa gccactgtta cagaaactta tcatgaattt 1260
gcgaatggca atatctatga tgataaaggc gctgctgttt atgcggcggc ggatggttct 1320
ctgactacag aaactacaag taaatcagaa gctacagcta acccgctggc cgctctggac 1380
gacgcaatca gccagatcga caaattccgt tcatccctgg gtgctatcca gaaccgtctg 1440
gattccgcag tcaccaacct gaacaacacc actaccaatc tgtctgaagc gcagtcccgt 1500
attcaggacg ccgactatgc gaccgaagtg tccaatatgt cgaaagcgca gatcatccag 1560
caggcaggca actccgtgct ggcaaaa 1587



<210> 36 
<211> 1245 
<212> DNA 

<213> Escherichia coli 
<400> 36 

aacaaaaacc agtctgcgct gtcgacttct atcgagcgcc tctcttctgg tctgcgcatt 60 
aacagcgcta aagatgacgc tgcgggccag gcgattgcta accgcttcac ttctaacatc 120 
aaaggtctga ctcaggccgc acgtaacgcc aacgacggta tctctctggc gcagaccact 180 
gaaggcgcac tgtctgaaat caacaacaac ttgcagcgtg ttcgtgaact gaccgttcag 240 
gccactaccg gtactaactc tgattctgac ctgtcttcaa tccaggacga aatcaaatcc 300
cgtctcgatg aaattgaccg cgtatccggt cagactcagt tcaacggcgt gaacgtactg 360 
gcaaaagatg gctcgatgaa aattcaggtc ggtgcaaatg atggtcagac aatcagcatt 420 
gatttgcaga agattgattc ttctacttta gggttaaatg gtttttctgt ttccaaaaat 480 
gcagtatctg ttggtgatgc tattactcaa ttgcctggcg agacggcagc cgatgcacca 540 
gtaaccatca agtttgatga ttcagtaaaa actgatttaa aactgaccga tgcttcaggg 600 
ttaagtctgc ataacctcaa agatgaaaat ggtaatttaa ctaaccagta tgttgtacag 660 
aatggcggaa aatcttacgc tgctacagtc gctgccaatg gtaatgttac gctgaacaaa 720
gcaaatgtaa cctacagcga tgtcgcaaac ggtattgata ccgcaacgca gtcaggccag 780
ttagttcagg ttggtgcaga ttctaccggt acgccaaaag cattcgtgtc tgtccaaggt 840
aaaagctttg gcattgatga cgccgccttg aagaataaca ctggtgatgc taccgctact 900
caaccgggaa catctgggac aacagttgtc gcagcgtcaa ttcatctgag tacgggcaaa 960
aactctgtag acgctgatgt aacggcttcc actgaattca caggtgcttc aaccaacgat 1020 
ccactgactc tgctggacaa agctatcgca tctgttgata aattccgttc ttctttgggg 1080 
gcggtacaga accgtctgag ctccgctgta accaacctga acaacaccac caccaacctg 1140 
tctgaagcgc agtcccgtat tcaggacgcc gactatgcga ccgaagtgtc caacatgtcg 1200 
aaagcgcaga ttatccagca ggcaggtaac tccgtgctgt ccaaa 1245

<210> 37
<211> 1185
<212> DNA

<213> Escherichia coli
<400> 37

aacaaaaacc agtctgcgct gtcgacttct atcgagcgcc tctcttctgg tctgcgcatt 60
aacagcgcta aagatgacgc tgcgggccag gcgattgcta accgcttcac ttctaacatc 120
aaaggtctga ctcaggctgc acgtaacgcc aatgacggta tttctctagc acagacagcg 180
gaaggcgcgc tgtcagagat taacaacaac ttgcagcgtg tgcgtgagtt gaccgtgcag 240
gcaaccactg gtaccaactc tgattccgat ctctcttcta ttcaggatga aattaaatct 300
cgtctggatg aaattgaccg cgtctctggt cagacccagt ttaacggcgt gaacgtactg 360
gctaaaaacg gttctatggc aattcaggtt ggcgcgaacg atggccagac tatctctatc 420
gacctgcaga aaatagactc ttctactctg ggtctgagcg gcttctctgt ttctcagaac 480
tccctgaaac tgagcgattc tatcactacg atcggcaata ctactgctgc atcgaagaac 540
gtggacctga gcgcagtagc aactaaactg ggcgtgaatg caagcaccct gagcctgcac 600
gaagttcagg actctgctgg tgacggtact ggtaccttcg ttgtttcttc tggcagcgac 660
aactatgctg tgtctgtaga cgcggcctct ggtgcagtta acctgaacac cactgacgtc 720
acctatgatg acgctactaa tggtgttact ggcgcgactc agaacggtca gctgatcaaa 780

gtaacttctg acgccaacgg tgcagctgtt ggttacgtaa ccattcaggg taaaaactat 840 

caggctggtg cgaccggtgt tgacgttctg gcgaacagcg gtgttgcagc tccaactaca 900

gctgttgata ccggtactct gcaactgagc ggtactggtg caactactga gctgaaaggt 960 

actgcaactc agaacccact ggcactattg gacaaagcta tcgcttctgt tgataaattc 1020 

cgttcttctc tgggtgcggt acagaatcgt ctgagctctg ctgtaaccaa cctgaataac 1080 

accaccacta acctgtctga agcgcagtcc cgtattcagg atgccgacta tgcgaccgaa 1140 

gtgtcaaata tgtctaaagc gcagatcgtt cagcaggccg gtaac 1185 



<210> 38 
<211> 1383 
<212> DNA 

<213> Escherichia coli 



<400> 38 

aacaaatctc agtcttctct tagctctgct attgagcgtc tgtcttctgg tctgcgtatt 60
aacagcgcaa aagacgatgc agcaggtcag gcgattgcta accgttttac ggcaaatatt 120
aaaggtctga cccaggcttc ccgtaacgca aatgatggta tttctgttgc gcagaccact 180
gaaggtgcgc tgaatgaaat taacaacaac ctgcagcgta ttcgtgaact ttctgttcag 240
gcaactaacg gtactaactc tgacagcgat ctttcttcta tccaggctga aattactcaa 300
cgtctggaag aaattgaccg tgtatctgag caaactcagt ttaacggcgt gaaagtcctt 360
gctgaaaata atgaaatgaa aattcaggtt ggtgctaatg atggtgaaac catcactatc 420
aatctggcaa aaattgatgc gaaaactctc ggcctggacg gttttaatat cgatggcgcg 480
cagaaagcaa caggcagtga cctgatttct aaatttaaag cgacaggtac tgataattat 540
gatgttggcg gtaaaactta taccgtgaat gtggagagcg gcgcggttaa gaatgatgct 600
aataaagatg tttttgtaag cgcagctgat ggatcgctga cgaccagtag tgatactaaa 660
gtatccggtg aaagtattga tgcaacagaa ctagcgaaac ttgcaataaa attagctgac 720
aaaggctcca ttgaatacaa gggcattaca tttactaaca acactggcgc agagcttgat 780
gctaatggta aaggtgtttt gaccgcaaat attgatggtc aagatgttca atttactatt 840
gacagtaatg cacccacggg tgccggcgca acaataacta cagacacagc tgtttacaaa 900
aacagtgcgg gccagttcac cactacaaaa gtggaaaata aagccgcaac actctctgat 960
ctggatctta atgcagccaa gaaaacaggt agcactttag ttgtaaatgg cgccacctac 1020
aatgtcagcg cagatggtaa aacggtaact gatactactc ctggtgcccc taaagtgatg 1080
tatctgagca aatcagaagg tggtagcccg attctggtaa acgaagatgc agcaaaatcg 1140
ttgcaatcta ccaccaaccc gctcgaaact atcgacaagg cattggctaa agttgacaat 1200
ctgcgttctg acctcggtgc agtacaaaac cgtttcgact ctgccatcac caaccttggc 1260
aacaccgtaa acaacctgtc ttctgcccgt agccgtatcg aagatgctga ctacgcgacc 1320
gaagtgtcta acatgtctcg tgcgcagatc ctgcaacaag cgggtacctc tgttctggcg 1380
cag 1383



<210> 39 
<211> 1680 
<212> DNA 

<213> Escherichia coli 



<400> 39 

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60 
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120
gcgaaggatg acgccgcagg tcaggcgatt gctaaccgtt tcacctctaa cattaaaggc 180

ctgactcagg ctgcacgtaa cgccaacgac ggtatttctg ttgcacagac caccgaaggc 240 

gcgctgtccg aaatcaacaa caacttacag cgtatccgtg aactgacggt tcaggcttct 300

accgggacta actctgattc ggatctggac tccattcagg acgaaatcaa atcccgtctg 360

gacgaaattg accgcgtatc cggccagacc cagttcaacg gcgtgaacgt gctggcgaaa 420

gacggttcaa tgaaaattca ggttggtgcg aatgacggcc agactatcac tattgatctg 480 

aagaaaattg actctgatac tctgggtttg agtggattta atgtgaatgg caaaggggct 540

gtggctaacg caaaagcgac cgaagcagat ttaacggggg ctggtttctc tcaaggagcg 600 

gtggatacaa acggaaatag tacttggaca aaatcaacca ccaccaatta ctcagctgca 660 

acaactgctg acttgttatc gaccattaag gatggctcta ctgttacata tgcagggaca 720

gacaccggat taggggtcgc agcagcagga aattatactt atgatgcgaa cagtaaatct 780

tattccttca atgccaatgg tctgacgggc gcaaataccg caactgcact caaaggttac 840

ttggggacag gtgctaacac cgctaaaatt tctatcggtg gtacagagca ggaagtgaat 900 

attgccaaag atggcactat tacagatacg aatggtgatg cgctctatct ggatattacc 960

ggcaacctga ctaagaacta tgcgggttca ccacctgcag caacgctgga taacgtatta 1020 

gcttccgcaa ctgtaaatgc cactatcaag tttgatagcg gtatgacggt tgattacact 1080

gcaggtactg gcgcgaatat tacaggtgca tccatttctg cagatgacat ggccgcaaaa 1140 

ctgagcggaa aggcgtacac tgttgccaat ggtgctgagt cttatgacgt tgctgcagtt 1200 

acgggggctg taacaactac agcaggtaat tcacctgtgt atgccgatgc agacggtaaa 1260

ttaacgacga gtgccagtaa tacggttact cagacttatc acgagtttgc taatggtaac 1320

atttatgatg acaaaggctc gtcactgtat aaagctgcag atggctctct gacttctgaa 1380

gctaaaggga aatctgaagc aaccgccgat cccctgaaag ctctggacga agccatcagc 1440 

tccatcgaca aattccgctc ctccctcggt gccgttcaaa accgtctgga ttctgcggtg 1500 

accaacctga acaacaccac taccaacctg tctgaagcgc agtcccgtat tcaggacgcc 1560 

gactatgcga ccgaagtgtc caatatgtcg aaagcgcaga tcatccagca ggccggtaac 1620 

tccgtgttgg caaaagctaa ccaggtaccg cagcaggttc tgtctctgct gcagggttaa 1680 

<210> 40 
<211> 1146 
<212> DNA 

<213> Escherichia coli 
<400> 40 

gcgctgtcga cttctatcga gcgcctctct tctggtttgc gcattaacag cgctaaagat 60 
gacgctgcgg gccaggcgat tgctaaccgc ttcacttcta acatcaaagg tctgactcag 120
gccgcacgta acgccaacga cggtatctct ctggcgcaga ccactgaagg cgcactgtct 180 
gaaatcaaca acaacttgca gcgtgttcgt gaactgaccg ttcaggccac taccggtact 240 
aactctgatt ctgacctgtc ttcaatccag gacgaaatca aatcccgctt ggctgaaatc 300
gatcgtgtct ctggtcagac ccagttcaac ggcgtgaacg tgctggctaa aaacggttct 360
ctgaatattc aggttggcgc gaatgatggg cagaccatct ctatcgattt gcagaaaata 420 
gactcttctg cccttggttt aagtggtttt agtgttgccg gtggggcgct aaaattaagc 480 
gatacagtga cgcaggtcgg cgatggttca gccgcgccag ttaaagtgga tctggatgca 540 
gcagcaacag atattggtac tgctttgggg caaaaggtta atgcaagttc tttaacgttg 600 
cacaatatct tagacaaaga tggtgcggca actgagaact atgttgttag ctatggtagt 660 
gataattacg ctgcatctgt tgcagatgac gggactgtaa ctcttaataa aacggatatt 720 
acttattcag gcggtgatat taccggcgct accaaagatg atacgttgat taaagttgct 780 
gctaattctg acggagaggc cgttggtttc gctaccgttc agggtaagaa ttatgaaatt 840 
acagatggtg taaaaaacca gtccactgct gcaccaaccg atattgctca gaccattgat 900
ctggatacgg ctgatgaatt tactggggct tccactgctg atccactggc acttttagac 960

aaagctattg cacaggttga tactttccgc tcctccctcg gtgccgttca aaaccgtctg 1020 

gattccgcag tcaccaacct gaacaacact actaccaacc tgtctgaagc gcagtcccgt 1080

attcaggacg ccgactatgc gaccgaagtg tccaatatgt cgaaagcgca gatcatccag 1140 
caggcc 1146 

<210> 41 
<211> 1506 
<212> DNA 

<213> Escherichia coli 
<400> 41 

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60 

aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120 

gcgaaggatg acgcagcggg tcaggcgatt gctaaccgtt ttacttctaa tattaaaggc 180 

ctgactcagg ctgcacgtaa cgccaatgac ggtatttctc tggcgcagac cactgaaggc 240

gcactgtctg aaatcaacaa caacttgcag cgtgtgcgtg aactgaccgt acaggcgaca 300 

accggaacga actccgaatc tgacctgtcc tctatccagg acgaaatcaa atcccgtctg 360

gaagagattg accgcgtatc cggccagact cagttcaacg gcgtgaatgt gctggcaaaa 420 

gacggcacca tgaaaattca ggtaggcgcg aacgatggtc agactatctc tatcgatctg 480 

aaaaaaatcg actcttcaac cctgggcctg accggttttg atgtttcgac gaaagcgaat 540 

atttctacga cagcagtaac gggggcggca acgaccactt atgctgatag cgccgttgca 600 

attgatatcg gaacggatat tagcggtatt gctgctgatg ctgcgttagg aacgatcaat 660 

ttcgataata caacaggcaa gtactacgca cagattacca gtgcggccaa tccgggcctt 720 

gatggtgctt atgaaatcca tgttaatgac gcggatggtt ccttcactgt agcagcgagt 780 

gataaacaag cgggtgctgc tccgggtact gctctgacaa gcggtaaagt tcagactgca 840 

accaccacgc caggtacggc tgttgatgtc actgcggcta aaactgctct ggctgcagca 900 

ggtgctgaca cgagtggcct gaaactggtt caactgtcca acacggattc cgcaggtaaa 960 

gtgaccaacg tgggttacgg cctgcagaat gacagcggca ctatctttgc aaccgactac 1020 

gatggcacca ctgtgaccac gccgggcgca gagactgtga cttacaaaga tgcttccggt 1080 

aacagcacca ctgcggctgt cacactgggt ggctctgatg gcaaaaccaa tctggttacc 1140 

gccgctgacg gcaaaacgta cggtgcgact gcactgaatg gtgctgatct gtccgatcct 1200

aataacaccg ttaaatctgt tgcagacaac gctaaaccgt tggctgccct ggatgatgca 1260

attgcgatgg tcgacaaatt ccgctcctcc ctcggtgcgg tgcaaaaccg tctggattcc 1320 

gcagtcacca acctgaacaa caccactacc aacctgtctg aagcgcagtc ccgtattcag 1380 

gacgccgact atgcgaccga agtgtccaac atgtcgaaag cgcagattat ccagcaggca 1440 

ggtaactccg tgctgtccaa agctaaccag gttccgcagc aggttctgtc tctgctgcag 1500 
ggttaa 1506 

<210> 42 
<211> 950 
<212> DNA 

<213> Escherichia coli 
<400> 42 

aacaaaaacc agtctgcgct gtcgacttct atcgagcgcc tctcttctgg tctgcgtatt 60 

aacagcgcta aagatgacgc cgcgggccag gcgattgcta accgctttac ttctaacatc 120 

aaaggtctga ctcaggccgc acgtaacgcc aacgacggta tttctctggc gcagacggct 180

gaaggcgcgc tgtcagagat taacaacaac ttgcagcgta ttcgtgaact gaccgttcag 240

gcctctaccg gcacgaactc tgattccgac ctgtcttcta ttcaggacga aatcaaatcc 300

cgtcttgatg aaattgaccg tgtatctggt cagacccagt tcaacggtgt gaacgtgctg 360

tcgaaaaacg attcgatgaa gattcagatt ggtgccaatg ataaccagac gatcagcatt 420 

ggcttgcaac aaatcgacag taccactttg aatctgaaag gatttaccgt gtccggcatg 480 

gcggatttca gcgcggcgaa actgacggct gctgatggta cagcaattgc tgctgcggat 540 

gtcaaggatg ctgggggtaa acaagtcaat ttactgtctt acactgacac cgcgtctaac 600 

agtactaaat atgcggtcgt tgattctgca accggtaaat acatggaagc cactgtagcc 660 

attaccggta cggcggcggc ggtaactgtt ggtgcagcgg aagtggcggg agccgctaca 720

gccgatccgt taaaagcact ggatgccgca atcgctaaag tcgacaaatt ccgctcctcc 780 

ctcggtgccg ttcaaaaccg tctggattct gcggtcacca acctgaacaa caccaccacc 840 

aacctgtctg aagcgcagtc ccgtattcag gacgccgact atgcgaccga agtgtccaac 900 

atgtcgaaag cgcagattat ccagcaggcc ggtaactccg tgctggcaaa 950 



<210> 43 
<211> 1707 
<212> DNA 

<213> Escherichia coli 



<400> 43 

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120
gcgaaggatg acgcagcggg tcaggcgatt gctaaccgtt ttacctctaa cattaaaggt 180
ctgactcagg ctgcacgtaa cgccaacgac ggtatttctg ttgcacagac cactgaaggc 240
gcgctgtccg aaatcaacaa caacttacag cgtatccgtg aactgacggt tcaggcttct 300
accgggacta actccgattc ggatctggac tccattcagg acgaaatcaa atcccgtctg 360
gacgaaattg accgcgtatc cggtcaaacc cagttcaacg gtgtgaacgt actggcgaaa 420
gacggttcga tgaaaattca ggttggtgcg aatgacggcc agactatcac gattgatctg 480
aagaaaattg actcagatac gctggggctg aatggtttca acgttaatgg caaaggcact 540
attgcgaaca aagctgctac agtcagcgat ctgaccgctg ctggtgcaac gggaacaggt 600
ccttatgctg tgaccacaaa caatacagca ctcagcgcta gcgatgcact gtctcgcctg 660
aaaaccggag atacagttac tactactggc tcgagtgctg cgatctatac ttatgatgcg 720
gctaaaggga acttcaccac tcaagcaaca gttgcagatg gcgatgttgt taactttgcg 780
aatactctga aaccagcggc tggcactact gcatcaggtg tttatactcg tagtactggt 840
gatgtgaagt ttgatgtaga tgctaatggc gatgtgacca tcggtggtaa agccgcgtac 900
ctggacgcca ctggtaacct atctacaaac aaccccggca ttgcatcttc agcgaaattg 960
tccgatctgt ttgctagcgg tagtacctta gcgacaactg gttctatcca gctgtctggc 1020
acaacttata actttggtgc agcggcaact tctggcgtaa cctacaccaa aactgtaagc 1080
gctgatactg tactgagcac agtgcagagt gctgcaacgg ctaacacagc agttactggt 1140
gcgacaatta agtataatac aggtattcag tctgcaacgg cgtccttcgg tggtgtgaat 1200
actaatggtg ctggtaattc gaatgacacc tatactgatg cagacaaaga gctcaccaca 1260
accgcatctt acactatcaa ctacaacgtc gataaggata ccggtacagt aactgtagct 1320
tcaaatggcg caggtgcaac tggtaaattt gcagctactg ttggggcaca ggcttatgtt 1380
aactctacag gcaaactgac cactgaaacc accagtgcag gcactgcaac caaagatcct 1440
ctggctgccc tggatgaagc tatcagctcc atcgacaaat tccgttcatc cctgggtgct 1500
atccagaacc gtctggattc cgcggttacc aacctgaaca acaccactac caacctgtcc 1560
gaagcgcagt cccgtattca ggacgccgac tatgcgaccg aagtgtccaa catgtcgaaa 1620
gcgcagatta tccagcaggc cggtaactcc gtgctggcaa aagccaacca ggtaccgcag 1680
caggttctgt ctctgctgca gggttaa 1707

<210> 44 
<211> 1720 
<212> DNA 

<213> Escherichia coli 
<400> 44 

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60 
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120 
gcgaaggatg acgccgcagg tcaggcgatt gctaaccgtt ttacttctaa tattaaaggc 180 
ctgactcagg ctgcacgtaa cgccaatgac ggtatttctg ttgcacagac cactgaaggc 240 
gcgctgtccg aaatcaacaa caacttacag cgtgtgcgtg aactgaccgt tcaggcgacc 300
accggtacca actcccagtc tgatctggac tctatccagg acgaaatcaa atcccgtctg 360
gacgaaattg accgcgtatc cggtcagact cagttcaacg gcgtgaacgt actggcaaaa 420 
gacggttcca tgaaaattca ggttggcgcg aatgatggcc agaccatcac tatcgacctg 480 
aagaagattg actcttctac gttgaaactg actggtttta acgtgaatgg ttctggttct 540 
gtggcgaata ctgcggcgac taaagacgaa ctggctgctg ctgctgcggc ggcgggtaca 600 
actcctgctg tcggtactga cggcgtgacc aaatataccg tagacgcagg gcttaacaaa 660 
gccacagcag caaacgtgtt tgcaaacctt gcagatggtg ctgttgttga tgctagcatt 720 
tccaacggtt ttggtgcagc agcagccaca gactacacct acaataaagc tacaaatgat 780 
ttcactttca atgccagcat tgctgctggt gctgcggccg gtgatagtaa cagcgcagct 840 
ctgcaatcct tcctgactcc aaaagcaggt gatacagcta acctgagcgt caaaatcggt 900 
acgacatctg ttaatgttgt tctggcgagc gatggcaaaa ttacagcgaa agatggctca 960 
gctctgtata tcgactcaac gggtaacctg actcagaaca gcgcaggcac tgtaacagca 1020 
gcaaccctgg atggactgac caaaaaccat gatgcgacag gagctgttgg tgttgatatc 1080 
acgaccgcag atggcgcaac tatctctctg gcaggctctg ctaacgcggc aacaggtact 1140 
caatcaggtg caattacact gaaaaatgtt cgtatcagtg ctgatgctct gcagtctgct 1200 
gcgaaaggta ctgttatcaa tgttgataat ggtgctgatg atatttctgt tagtaaaacc 1260 
gggtgtcgtt actaccggag gtgcgcctac ttatactgat gctgatggta aattaacgac 1320 
aaccaacacc gttgattatt tcctgcaaac tgatggcagc gtaaccaatg gttctggtaa 1380
aggggtttac accgatgcag ctggtaaatt cactaccgac gctgcaacca aagccgcaac 1440 
caccaccgat ccgctgaaag cccttgatga cgcaatcagc cagatcgata agttccgttc 1500 
atccctgggt gctatccaga accgtctgga ttccgcggtt accaacctga acaacaccac 1560 
taccaacctg tccgaagcgc agtcccgtat tcaggacgcc gactatgcga ccgaagtgtc 1620 
caatatgtcg aaagcgcaga tcatccagca ggccggtaac tccgtgttgg caaaagctaa 1680 
ccaggtaccg cagcaggttc tgtctctgct gcagggttaa 1720 

<210> 45 
<211> 14516 
<212> DNA 

<213> Escherichia coli
<400> 45 

gatctgatgg ccgtagggcg ctacgtgctt tctgctgata tctgggctga gttggaaaaa 60
actgctccag gtgcctgggg acgtattcaa ctgactgatg ctattgcaga gttggctaaa 120
aaacagtctg ttgatgccat gctgatgacc ggcgacagct acgactgcgg taagaagatg 180
ggctatatgc aggcattcgt taagtatggg ctgcgcaacc ttaaagaagg ggcgaagttc 240
cgtaagagca tcaagaagct actgagtgag tagagattta cacgtctttg tgacgataag 300
ccagaaaaaa tagcggcagt taacatccag gcttctatgc tttaagcaat ggaatgttac 360 
tgccgttttt tatgaaaaat gaccaataat aacaagttaa cctaccaagt ttaatctgct 420 
ttttgttgga ttttttcttg tttctggtcg catttggtaa gacaattagc gtgagtttta 480 
gagagttttg cgggatctcg cggaactgct cacatctttg gcatttagtt agtgcactgg 540 
tagctgttaa gccaggggcg gtagcttgcc taattaattt ttaacgtata catttattct 600 
tgccgcttat agcaaataaa gtcaatcgga ttaaacttct tttccattag gtaaaagagt 660 
gtttgtagtc gctcagggaa attggttttg gtagtagtac ttttcaaatt atccattttc 720
cgatttagat ggcagttgat gttactatgc tgcatacata tcaatgtata ttatttactt 780
ttagaatgtg atatgaaaaa aatagtgatc ataggcaatg tagcgtcaat gatgttaagg 840 
ttcaggaaag aattaatcat gaatttagtg aggcaaggtg ataatgtata ttgtctagca 900 
aatgattttt ccactgaaga tcttaaagta ctttcgtcat ggggcgttaa gggggttaaa 960
ttctctctta actcaaaggg tattaatcct tttaaggata taattgctgt ttatgaacta 1020 
aaaaaaattc ttaaggatat ttccccagat attgtatttt catattttgt aaagccagta 1080 
atatttggaa ctattgcttc aaagttgtca aaagtgccaa ggattgttgg aatgattgaa 1140 
ggtctaggta atgccttcac ttattataag ggaaagcaga ccacaaaaac taaaatgata 1200 
aagtggatac aaattctttt atataagtta gcattaccga tgcttgatga tttgattcta 1260
ttaaatcatg atgataaaaa agatttaatc gatcagtata atattaaagc taaggtaaca 1320 
gtgttaggtg ggattggatt ggatcttaat gagttttcat ataaagagcc accgaaagag 1380
aaaattacct ttatttttat agcaaggtta ttaagagaga aagggatatt tgagtttatt 1440 
gaagccgcaa agttcgttaa gacaacttat ccaagttctg aatttgtaat tttaggaggt 1500 
tttgagagta ataatccttt ctcattacaa aaaaatgaaa ttgaatcgct aagaaaagaa 1560 
catgatctta tttatcctgg tcatgtggaa aatgttcaag attggttaga gaaaagttct 1620 
gtttttgttt tacctacatc atatcgagaa ggcgtaccaa gggtgatcca agaagctatg 1680 
gctattggta gacctgtaat aacaactaat gtacctgggt gtagggatat aataaatgat 1740 
ggggtcaatg gctttttgat acctccattt gaaattaatt tactggcaga aaaaatgaaa 1800 
tattttattg agaataaaga taaagtactc gaaatggggc ttgctggaag gaagtttgca 1860 
gaaaaaaact ttgatgcttt tgaaaaaaat aatagactag catcaataat aaaatcaaat 1920
aatgattttt gacttgagca gaaattattt atatttcaat ctgaaaaata aaggctgtta 1980 
ttatgaataa agtggcatta attactggta tcactgggca agatggctcc tatttggcag 2040
aattattgtt agaaaaaggt tatgaagttc atggtattaa acgccgtgca tcttcattta 2100 
atactgagcg agtggatcac atctatcagg attcacattt agctaatcct aaactttttc 2160 
tacactatgg cgatttgaca gatacttcca atctgacccg tattttaaaa gaagttcaac 2220 
cagatgaagt ttacaatttg ggggcgatga gccatgtagc ggtatcattt gagtcaccag 2280
aatacactgc tgatgttgat gcgataggaa cattgcgtct tcttgaagct atcaggatat 2340 
tggggctgga aaaaaagaca aaattttatc aggcttcaac ttcagagctt tatggtttgg 2400 
ttcaagaaat tccacaaaaa gagactacgc cattttatcc acgttcgcct tatgctgttg 2460 
caaaattata tgcctattgg atcactgtta attatcgtga gtcttatggt atgtttgcct 2520 
gcaatggtat tctctttaac cacgaatcac ctcgccgtgg cgagaccttt gttactcgta 2580
aaataacacg cgggatagca aatattgctc aaggtcttga taaatgctta tacttgggaa 2640 
atatggattc tctgcgtgat tggggacatg ctaaggatta tgtcaaaatg caatggatga 2700 
t&ctgcagca agaaactcca gaagattttg taattgctac aggaattcaa tattctgtcc 2760
gtgagtttgt cacaatggcg gcagagcaag taggcataga gttagcattt gaaggtgagg 2820 
gagtaaatga aaaaggtgtt gttgtttcgg tcaatggcac tgatgctaaa gctgtaaacc 2880 
cgggcgatgt aattatatct gtagatccaa ggtattttag gcctgcagaa gttgaaacct 2940 
tgcttggcga tcctactaat gcgcataaaa aattaggatg gagccctgaa attacattgc 3000 
gtgaaatggt aaaagaaatg gtttccagcg atttagcaat agcgaaaaag aacgtcttgc 3060 
tgaaagctaa taacattgcc actaatattc cgcaagaata aaaaagataa tacattaaat 3120
aattaaaaat ggtgctagat ttattagtac cattattttt ttttgggtga ctaatgttta 3180
ttacatcaga taaatttaga gaaattatca agttagttcc attagtatca attgatctgc 3240 
taattgaaaa cgagaatggt gaatatttat ttggtcttag gaataatcga ccggccaaaa 3300 
attatttttt tgttccaggt ggtaggattc gcaaaaatga atctattaaa aatgctttta 3360
aaagaatatc atctatggaa ttaggtaaag agtatggtat ttcaggaagt gtttttaatg 3420
gtgtatggga acatttctat gatgatggtt ttttttctga aggcgaggca acacattata 3480 
tagtgctttg ttacacactg aaagttctta aaagtgaatt gaatctccca gatgatcaac 3540
atcgtgaata cctttggcta actaaacacc aaataaatgc taaacaagat gttcataact 3600
attcaaaaaa ttattttttg taatttttat taaaaattaa tatgcgagag aattgtatgt 3660
ctcaatgtct ttaccctgta attattgccg gaggaaccgg aagccgtcta tggccgttgt 3720 
ctcgagtatt ataccctaaa caatttttaa atttagttgg ggattctaca atgttgcaaa 3780 
caacaattac gcgtttggat ggcatcgaat gcgaaaatcc aattgttatc tgcaatgaag 3840
atcaccgatt tattgtagca gagcaattac gacagattgg taagctaacc aagaatatta 3900
tacttgagcc gaaaggccgt aatactgcac ctgccatagc tttagctgct tttatcgctc 3960 
agaagaataa tcctaatgac gaccctttat tattagtact tgcggcagac cactctataa 4020 
ataatgaaaa agcatttcga gagtcaataa taaaagctat gccgtatgca acttctggga 4080 
agttagtaac atttggaatt attccggaca cggcaaatac tggttatgga tatattaaga 4140 
gaagttcttc agctgatcct aataaagaat tcccagcata taatgttgcg gagtttgtag 4200 
aaaaaccaga tgttaaaaca gcacaggaat atatttcgag tgggaattat tactggaata 4260
gcggaatgtt tttatttcgc gccagtaaat atcttgatga actacggaaa tttagaccag 4320 
atatttatca tagctgtgaa tgtgcaaccg ctacagcaaa tatagatatg gactttgtcc 4380
gaattaacga ggctgagttt attaattgtc ctgaagagtc tatcgattat gctgtgatgg 4440 
aaaaaacaaa agacgctgta gttcttccga tagatattgg ctggaatgac gtgggttctt 4500 
ggtcatcact ttgggatata agccaaaagg attgccatgg taatgtgtgc catggggatg 4560 
tgctcaatca tgatggagaa aatagtttta tttactctga gtcaagtctg gttgcgacag 4620 
tcggagtaag taatttagta attgtccaaa ccaaggatgc tgtactggtt gcggaccgtg 4680 
ataaagtcca aaatgttaaa aacatagttg acgatctaaa aaagagaaaa cgtgctgaat 4740 
actacatgca tcgtgcagtt tttcgccctt ggggtaaatt cgatgcaata gaccaaggcg 4800 
atagatatag agtaaaaaaa ataatagtta aaccaggaga agggttagat ttaaggatgc 4860 
atcatcatag ggcagagcat tggattgttg tatccggtac tgctaaagtt tcactaggta 4920 
gtgaagttaa actattagtt tctaatgagt ctatatatat ccctcaggga gcaaaatata 4980 
gtcttgagaa tccaggcgta atacctttgc atctaattga agtaagttct ggtgattacc 5040 
ttgaatcaga tgatatagtg cgttttactg acagatataa cagtaaacaa ttcctaaagc 5100 
gagattgata aatatgaata aaataacttg cttcaaagca tatgatatac gtgggcgtct 5160 
tggtgctgaa ttgaatgatg aaatagcata tagaattggt cgcgcttatg gtgagttttt 5220 
taaacctcaa actgtagttg tgggaggaga tgctcgctta acaagtgaga gtttaaagaa 5280
atcactctca aatgggctat gtgatgcagg cgtaaatgtc ttagatcttg gaatgtgtgg 5340 
tactgaagag atatattttt ccacttggta tttaggaatt gatggtggaa tcgaggtaac 5400 
tgcaagccat aatccaattg attataatgg aatgaaatta gtaaccaaag gtgctcgacc 5460 
aatcagcagt gacacaggtc tcaaagatat acaacaatta gtagagagta ataattttga 5520
agagctcaac ctagaaaaaa aagggaatat taccaaatat tccacccgag atgcctacat 5580 
aaatcatttg atgggctatg ctaatctgca aaaaataaaa aaaatcaaaa tagttgtgaa 5640 
ttctgggaat ggtgcagctg gtcctgttat tgatgctatt gaggaatgct ttttacggaa 5700 
caatattccg attcagtttg taaaaataaa taatacaccc gatggtaatt ttccacatgg 5760 
tatccctaat ccattactac ctgagtgcag agaagatacc agcagtgcgg ttataagaca 5820 
tagtgctgat tttggtattg catttgatgg tgattttgat aggtgttttt tctttgatga 5880 
aaatggacaa tttattgaag gatactacat tgttggttta ttagcggaag tttttttagg 5940 
gaaatatcca aacgcaaaaa tcattcatga tcctcgcctt atatggaata ctattgatat 6000
cgtagaaagt catggtggta tacctataat gactaaaacc ggtcatgctt acattaagca 6060
aagaatgcgt gaagaggatg ccgtatatgg cggcgaaatg agtgcgcatc attattttaa 6120 
agattttgca tactgcgata gtggaatgat tccttggatt ttaatttgtg aacttttgag 6180 
tctgacaaat aaaaaattag gtgaactggt ttgtggttgt ataaacgact ggccggcaag 6240 
tggagaaata aactgtacac tagacaatcc gcaaaatgaa atagataaat tatttaatcg 6300 
ttacaaagat agtgccttag ctgttgatta cactgatgga ttaactatgg agttctctga 6360
ttggcgtttt aatgttagat gctcaaatac agaacctgta gtacgattga atgtagaatc 6420
taggaataat gctattctta tgcaggaaaa aacagaagaa attctgaatt ttatatcaaa 6480 
ataaatttgc acctgagttc ataatgggaa caagaaatat atgaaagtac ttctgactgg 6540 
ctcaactggc atggttggta agaatatatt agagcatgat agtgcaagta aatataatat 6600 
acttactcca accagctctg atttgaattt attagataaa aatgaaatag aaaaattcat 6660 
gcttatcaac atgccagact gtattataca tgcagcggga ttagttggag gcattcatgc 6720 
aaatataagc aggccgtttg attttctgga aaaaaatttg cagatgggtt taaatttagt 6780 
ttccgtcgca aaaaaactag gtatcaagaa agtgcttaac ttgggtagtt catgcatgta 6840 
ccccaaaaac tttgaagagg ctattcctga gaaagctctg ttaactggtg agctagaaga 6900 
aactaatgag ggatatgcta ttgcgaaaat tgctgtagca aaagcatgcg aatatatatc 6960
aagagaaaac tctaattatt tttataaaac aattatccca tgtaatttat atgggaaata 7020 
tgataaattt gatgataact cgtcacatat gattccggca gttataaaaa aaatccatca 7080 
tgcgaaaatt aataatgtcc cagagatcga aatttggggg gatggtaatt cgcgccgtga 7140 
gtttatgtat gcagaagatt tagctgatct tattttttat gttattccta aaatagaatt 7200
catgcctaat atggtaaatg ctggtttagg ttacgattat tcaattaatg actattataa 7260
gataattgca gaagaaattg gttatactgg gagtttttct catgatttaa caaaaccaac 7320 
aggaatgaaa cggaagctag tagatatttc attgcttaat aaaattggtt ggtcaagtca 7380
ctttgaactc agagatggca tcagaaagac ctataattat tacttggaga atcaaaataa 7440 
atgattacat acccacttgc tagtaatact tgggatgaat atgagtatgc agcaatacag 7500 
tcagtaattg actcaaaaat gtttaccatg ggtaaaaagg ttgagttata tgagaaaaat 7560 
tttgctgatt tgtttggtag caaatatgcc gtaatggtta gctctggttc tacagctaat 7620
ctgttaatga ttgctgccct tttcttcact aataaaccaa aacttaaaag aggtgatgaa 7680 
ataatagtac ctgcagtgtc atggtctacg acatattacc ctctgcaaca gtatggctta 7740 
aaggtgaagt ttgtcgatat caataaagaa actttaaata ttgatatcga tagtttgaaa 7800 
aatgctattt cagataaaac aaaagcaata ttgacagtaa atttattagg taatcctaat 7860 
gattttgcaa aaataaatga gataataaat aatagggata ttatcttact agaagataac 7920 
tgtgagtcga tgggcgcggt ctttcaaaat aagcaggcag gcacattcgg agttatgggt 7980 
acctttagtt ctttttactc tcatcatata gctacaatgg aagggggctg cgtagttact 8040 
gatgatgaag agctgtatca tgtattgttg tgccttcgag ctcatggttg gacaagaaat 8100 
ttaccaaaag agaatatggt tacaggcact aagagtgatg atattttcga agagtcgttt 8160 
aagtttgttt taccaggata caatgttcgc ccacttgaaa tgagtggtgc tattgggata 8220 
gagcaactta aaaagttacc aggttttata tccaccagac gttccaatgc acaatatttt 8280 
gtagataaat ttaaagatca tccattcctt gatatacaaa aagaagttgg tgaaagtagc 8340 
tggtttggtt tttccttcgt tataaaggag ggagctgcta ttgagaggaa gagtttagta 8400 
aataatctga tctcagcagg cattgaatgc cgaccaattg ttactgggaa ttttctcaaa 8460 
aatgaacgtg ttttgagtta ttttgattac tctgtacatg atacggtagc aaatgccgaa 8520 
tatatagata agaatggttt ttttgtcgga aaccaccaga tacctttgtt taatgaaata 8580 
gattatctac gaaaagtatt aaaataacta acgaggcact ctatttcgaa tagagtgcct 8640 
ttaagatggt attaacagtg aaaaaaattt tagcgtttgg ctattctaaa gtactaccac 8700 
cggttattga acagtttgtc aatccaattt gcatcttcat tatcacacca ctaatactca 8760 
accacctggg taagcaaagc tatggtaatt ggattttatt aattactatt gtatcttttt 8820 
ctcagttaat atgtggagga tgttccgcat ggattgcaaa aatcattgca gaacagagaa 8880
ttcttagtga tttatcaaaa aaaaatgctt tacgtcaaat ttcctataat ttttcaattg 8940

ttattatcgc atttgcggta ttgatttctt ttcttatatt aagtatttgt ttcttcgatg 9000 

ttgcgaggaa taattcttca ttcttattcg cgattattat ttgtggtttt tttcaggaag 9060 

ttgataattt atttagtggt gcgctaaaag gttttgaaaa atttaatgta tcatgttttt 9120 

ttgaagtaat tacaagagtg ctctgggctt ctatagtaat atatggcatt tacggaaatg 9180 

cactcttata ttttacatgt ttagccttta ccattaaagg tatgctaaaa tatattcttg 9240 

tatgtctgaa tattaccggt tgtttcatca atcctaattt taatagagtt gggattgtta 9300 

atttgttaaa tgagtcaaaa tggatgtttc ttcaattaac tggtggcgtc tcacttagtt 9360 

tgtttgatag gctcgtaata ccattgattt tatctgtcag taaactggct tcttatgtcc 9420 

cttgccttca actagctcaa ttgatgttca ctctttctgc gtctgcaaat caaatattac 9480 

taccaatgtt tgctagaatg aaagcatcta acacatttcc ctctaattgt ttttttaaaa 9540 

ttctgcttgt atcactaatt tctgttttgc cttgtcttgc gttattcttt tttggtcgtg 9600 

atatattatc aatatggata aaccctacat ttgcaactga aaattataaa ttaatgcaaa 9660 

ttttagctat aagttacatt ttattgtcaa tgatgacatc ttttcatttc ttgttattag 9720 

gaattggtaa atctaagctt gttgcaaatt taaatctggt tgcagggctc gcacttgctg 9780 

cttcaacgtt aatcgcagct cattatggcc tttatgcaat atctatggta aaaataatat 9840 

atccggcttt tcaattttat tacctttatg tagcttttgt ctattttaat agagcgaaaa 9900 

atgtctattg atttactttt ttcaattact gaaatcgcaa ttgttttttc ttgcactatt 9960 

tacatattta ctcaatgttt gttaatgcgg aggatctatt tagataaaag tattttaatt 10020 

cttttatgct tgctcttttt tttagtaatc attcaacttc ctgagcttaa tgtaaacggt 10080 

ttggtcgatt ctttaaagtt atcactgcct ttattgatgg tctttatcgc ttttcaaaaa 10140 

ccgaaattat gcttgtgggt tattattgca ttgttgtttt tgaactctgc atttaatttt 10200 

ttatatttaa agacattcga taagtttagc tcatttcctt ttactttttt tatattgctg 10260 

ttttacttgt ttagattggg aattggtaat ttaccggttt ataaaaataa aaaattttac 10320 

gcgttgattt ttctctttat attaatagac ataatgcagt cattgttaat aaattatagg 10380 

gggcagattt tatattccgt aatttgcatc ctgatacttg tgtttaaagt taatttaaga 10440 

aaaaagattc catacttttt tttaatgctg ccagttttat atgtaattat tatggcttat 10500 

attggtttta attatttcaa taaaggcgta actttttttg aacctacagc aagtaatatt 10560 

gaacgtacgg ggatgatata ttatttggtt tcacagcttg gtgattatat attccatggt 10620 

atggggacat taaatttctt aaataacggc ggacaatata agacgttata tggacttcca 10680 

tcattaattc ctaatgaccc tcatgatttt ttattacggt tctttataag tattggtgtg 10740 

ataggagcat tggtttatca ttctatattt tttgtttttt ttaggagaat atctttctta 10800 

ttatatgaga gaaatgctcc tttcattgtt gtaagttgtt tgttactgtt acaagttgtg 10860 

ttaatttata cattaaaccc ttttgatgct tttaatcgat tgatttgcgg gcttacagtt 10920 

ggagttgttt atggatttgc aaaaattaga taagtatacc tgtaatggaa atttagacgc 10980

tccacttgtt tcaataatca ttgcaactta taattctgaa cttgatatag ctaagtgttt 11040 

gcaatcggta actaatcaat cttataagaa tattgaaatc ataataatgg atggaggatc 11100 

ttctgataaa acgcttgata ttgcaaaatc gtttaaagac gaccgaataa aaatagtttc 11160 

agagaaagat cgtggaattt atgatgcctg gaataaagca gttgatttat ccattggtga 11220 

ttgggtagca tttattggtt cagatgatgt ttactatcat acagatgcaa ttgcttcatt 11280 

gatgaagggg gttatggtat ctaatggcgc ccctgtggtt tatgggagga cagcgcacga 11340

aggtcccgat aggaacatat ctggatitttc aggcagtgaa tggtacaacc taacaggatt 11400 

taagtttaat tattacaaat gtaatttacc attgcccatt atgagcgcaa tatattctcg 11460 

tgatttcttc agaaacgaac gttttgatat taaattaaaa attgttgctg acgctgattg 11520 

gtttctgaga tgtttcatca aatggagtaa agagaagtca ccttatttta ttaatgacac 11580 

gacccctatt gttagaatgg gatatggtgg ggtttcgact gatatttctt ctcaagttaa 11640 

aactacgcta gaaagtttca ttgtacgcaa aaagaataat atatcctgtt taaacataca 11700 

gctgattctt agatatgcta aaattctggt gatggtagcg atcaaaaata tttttggcaa 11760

taatgtttat aaattaatgc ataacgggta tcattcccta aagaaaatca agaataaaat 11820

atgaagattg tttatataat aaccgggctt acttgtggtg gagccgaaca ccttatgacg 11880 

cagttagcag accaaatgtt tatacgcggg catgatgtta atattatttg tctaactggt 11940 

atatctgagg taaagccaac acaaaatatt aatattcatt atgttaatat ggataaaaat 12000 

tttagaagct tttttagagc tttatttcaa gtaaaaaaaa taattgtcgc cttaaagcca 12060 

gatataatac atagtcatat gtttcatgct aatattttta gtcgttttat taggatgctg 12120 

attccagcgg tgcccctgat atgtaccgca cacaacaaaa atgaaggtgg caatgcaagg 12180 

atgttttgtt atcgactgag tgatttttta gcttctatta ctacaaatgt aagtaaagag 12240 

gctgttcaag agtttatagc aagaaaggct acacctaaaa ataaaatagt agagattccg 12300 

aattttatta atacaaataa atttgatttt gatattaatg tcagaaagaa aacgcgagat 12360

gcttttaatt tgaaagacag tacagcagta ctgctcgcag taggaagact tgttgaagca 12420 

aaagactatc cgaacttatt aaatgcaata aatcatttga ttctttcaaa aacatcaaat 12480 

tgtaatgatt ttattttgct tattgctggc gatggcgcat taagaaataa attattggat 12540 

ttggtttgtc aattgaatct tgtggataaa gttttcttct tggggcaaag aagtgatatt 12600

aaagaattaa tgtgtgctgc agatcttttt gttttgagtt ctgagtggga aggttttggt 12660 

ctcgttgttg cagaagctat ggcgtgtgaa cgtcccgttg ttgctaccga ttctggtgga 12720 

gttaaagaag tcgttggacc tcataatgat gttatccctg tcagtaatca tattctgttg 12780 

gcagagaaaa tcgctgagac acttaaaata gatgataacg caagaaaaat aataggtatg 12840 

aaaaatagag aatatattgt ttccaatttt tcaattaaaa cgatagtgag tgagtgggag 12900 

cgcttatatt ttaaatattc caagcgtaat aatataattg attgaaaata taagtttgta 12960 

ctctggatgc aatagtttct ctatgctgtt tttttactgg ctccgtattt ttacttatag 13020

ctggattttg ttatatatca gtattaatct gtctcaactt catctagact acattcaagc 13080 

cgcgcatgcg tcgcgcggtg actacacctg acaggagtat gtaatgtcca agcaacagat 13140 

cggcgtcgtc ggtatggcag tgatggggcg caacctggcg ctcaacatcg aaagccgcgg 13200 

ttataccgtc tccatcttca accgctcccg cgagaaaact gaagaagttg ttgccgagaa 13260 

cccggataag aaactggttc cttattacac ggtgaaagag ttcgtcgagt ctcttgaaac 13320 

cccacgtcgt atcctgttaa tggtaaaagc aggggcggga actgatgctg ctatcgattc 13380 

cctgaagccg tatctggata aaggcgacat cattattgat ggtggcaaca ccttcttcca 13440 

ggacactatc cgtcgtaacc gtgaactgtc cgcggaaggc tttaacttca tcggtaccgg 13500

cgtgtccggc ggtgaagagg gcgccctgaa aggcccatct atcatgccag gtggccagaa 13560

agaagcgtat gagctggttg cgcctatcct gaccaagatt gctgcggttg ctgaagatgg 13620

cgaaccatgt ataacttaca tcggtgctga cggtgcgggt cactacgtga agatggtgca 13680 

caacggtatc gaatatggcg atatgcagct gattgctgaa gcctattctc tgcttaaagg 13740 

cggccttaat ctgtctaacg aagagctggc aaccactttt accgagtgga atgaaggcga 13800 

gctaagtagc tacctgattg acatcaccaa agacatcttc accaaaaaag atgaagaggg 13860

taaatacctg gttgatgtga tcctggacga agctgcgaac aaaggcaccg gtaaatggac 13920 

cagccagagc tctctggatc tgggtgaacc gctgtcgctg atcaccgaat ccgtattcgc 13980 

tcgctacatc tcttctctga aagaccagcg cattgcggca tctaaagtgc tgtctggtcc 14040 

gcaggctaaa ctggctggtg ataaagcaga gttcgttgag aaagtccgtc gcgcgctgta 14100 

cctgggtaaa atcgtctctt atgcccaagg cttctctcaa ctgcgtgccg cgtctgacga 14160 

atacaactgg gatctgaact acggcgaaat cgcgaagatc ttccgcgcgg gctgcatcat 14220 

tcgtgcgcag ttcctgcaga aaattactga cgcgtatgct gaaaacaaag gcattgctaa 14280 

cctgttgctg gctccgtact tcaaaaatat cgctgatgaa tatcagcaag cgctgcgtga 14340 

tgtagtggct tatgctgtgc agaacggtat tccggtaccg accttctctg cagcggtagc 14400 

ctactacgac agctaccgtt ctgcggtact gccggctaat ctgattcagg cacagcgtga 14460 

ttacttcggt gcgcacacgt ataaacgcac tgataaagaa ggtgtgttcc acaccg 14516 

<210> 46
<211> 1380
<212> DNA 

<213> Escherichia coli 
<400> 46 

aacaaatctc agtcttctct tagctctgct attgagcgtc tgtcttctgg tctgcgtatt 60 

aacagcgcaa aagacgatgc agcaggtcag gcgattgcta accgttttac ggcaaatatt 120 

aaaggtctga cccaggcttc ccgtaacgcg aatgatggta tttctgttgc gcagaccact 180 

gaaggtgcgc tgaatgaaat taacaacaac ctgcagcgta ttcgtgaact ttctgttcag 240 

gcaactaacg gtactaactc tgacagcgat ctttcttcta tccaggctga aattactcaa 300 

cgtctggaag aaattgaccg tgtatctgag caaactcagt ttaacggcgt gaaagtcctt 360

gctgaaaata atgaaatgaa aattcaggtt ggtgctaatg atggtgaaac catcactatc 420 

aatctggcaa aaattgatgc gaaaactctc ggcctggacg gttttaatat cgatggcgcg 480 

cagaaagcaa ccggcagtga cctgatttct aaatttaaag cgacaggtac tgataattat 540 

caaattaacg gtactgataa ctatactgtt aatgtagata gtggagtagt acaggataaa 600 

gatggcaaac aagtttatgt gagtgctgcg gatggttcac ttacgaccag cagtgatact 660 

caattcaaga ttgatgcaac taagcttgca gtggctgcta aagatttagc tcaaggtaat 720 

aagattgtct acgaaggtat cgaatttaca aataccggca ctggcgctat acctgccaca 780 

ggtaatggtg aattaaccgc caatgttgat ggtaaggctg ttgaattcac tatttcgggg 840 

agtgctgata catcaggtac tagtgcaacc gttgccccta cgacagccct atacaaaaat 900 

agtgcagggc aattgactgc aacaaaagtt gaaaataaag cagcgacact atctgatctt 960 

gatctgaacg ctgccaagaa aacaggaagc acgttagttg ttaacggtgc aacttacgat 1020 

gttagtgcag atggtaaaac gataacggag actgcttctg gtaacaataa agtcatgtat 1080 

ctgagcaaat cagaaggtgg tagcccgatt ctggtaaacg aagatgcagc aaaatcgttg 1140 

caatctacca ccaacccgct cgaaactatc gacaaagcat tggctaaagt tgacaatctg 1200 

cgttctgacc tcggtgcagt acaaaaccgt ttcgactctg ccatcaccaa ccttggcaac 1260 

accgtaaaca acctgtcttc tgcccgtagc cgtatcgaag atgctgacta cgcgaccgaa 1320 

gtgtctaaca tgtctcgtgc gcagatcctg caacaagcgg gtacctctgt tctggcacag 1380 

<210> 47 
<211> 1497 
<212> DNA 

<213> Escherichia coli 
<400> 47 

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60 
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120 
gcgaaggatg acgcagcggg tcaggcgatt gctaaccgtt tcacctctaa cattaaaggc 180 
ctgactcagg cggcccgtaa cgccaacgac ggtatctccg ttgcgcagac caccgaaggc 240 
gcgctgtccg aaatcaacaa caacttacag cgtgtgcgtg aactgacggt acaggccact 300 
accggtacta actctgagtc tgatctgtct tctatccagg acgaaattaa atcccgtctg 360 
gatgaaattg accgcgtatc tggtcagacc cagttcaacg gcgtgaacgt gctggcaaaa 420 
aatggctcca tgaaaatcca ggttggcgca aatgataacc agactatcac tatcgatctg 480 
aagcagattg atgctaaaac tcttggcctt gatggtttta gcgttaaaaa taacgataca 540 
gttaccacta gtgctccagt aactgctttt ggtgctacca ccacaaacaa tattaaactt 600 
actggaatta ccctttctac ggaagcagcc actgatactg gcggaactaa cccagcttca 660 
attgagggtg tttatactga taatggtaat gattactatg cgaaaatcac cggtggtgat 720 
aacgatggga agtattacgc agtaacagtt gctaatgatg gtacagtgac aatggcgact 780 
ggagcaacgg caaatgcaac tgtaactgat gcaaatacta ctaaagctac aactatcact 840 

tcaggcggta cacctgttca gattgataat actgcaggtt ccgcaactgc caaccttggt 900 

gctgttagct tagtaaaact gcaggattcc aagggtaatg ataccgatac atatgcgctt 960 

aaagatacaa atggcaatct ttacgctgcg gatgtgaatg aaactactgg tgctgtttct 1020 

gttaaaacta ttacctatac tgactcttcc ggtgccgcca gttctccaac cgcggtcaaa 1080 

ctgggcggag atgatggcaa aacagaagtg gtcgatattg atggtaaaac atacgattct 1140 

gccgatttaa atggcggtaa tctgcaaaca ggtttgactg ctggtggtga ggctctgact 1200 

gctgttgcaa atggtaaaac cacggatccg ctgaaagcgc tggacgatgc tatcgcatct 1260 

gtagacaaat tccgttcttc cctcggtgcg gtgcaaaacc gtctggattc cgcggttacc 1320 

aacctgaaca acaccactac caacctgtct gaagcgcagt cccgtattca ggacgccgac 1380 

tatgcgaccg aagtgtccaa tatgtcgaaa gcgcagatca tccagcaggc cggtaactcc 1440 

gtgttggcaa aagctaacca ggtaccgcag caggttctgt ctctgctgca gggttaa 1497 



<210> 48 
<211> 1695 
<212> DNA 

<213> Escherichia coli 



<400> 48 

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60 
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120 
gcgaaggatg acgccgcagg tcaggcgatt gctaaccgtt ttacttctaa cattaaaggc 180 
ctgactcagg ctgcacgtaa cgccaacgac ggtatttctg ttgcgcagac caccgaaggc 240 
gcgctgtctg aaatcaacaa caacttacag cgtattcgtg aactgacggt tcaggcttct 300 
accgggacta actctgattc ggatctggac tccattcagg acgaaatcaa atcccgtctg 360 
gacgaaattg accgcgtatc cggtcaaacc cagttcaacg gtgtgaacgt actggcgaaa 420 
gacggttcga tgaaaattca ggttggtgcg aatgacggcc agactatcac tattgatctg 480 
aagaaaattg actctgatac gctggggctg aatggtttta acgttaacgg caaaggtact 540 
attgcgaaca aagcggcaac cattagtgat ctggcggcga cgggggcgaa tgttactaac 600 
tcaagcaata ttgttgtcac gacaaagttc aatgccttgg atgcagcgac tgcatttagc 660 
aaactcaaag atggtgattc tgttgccgtt gctgctcaga aatatactta taacgcatcg 720 
accaatgatt ttacgacaga aaatacagta gcgacaggca ctgcaacgac agatcttggc 780 
gctactctga aggctgctgc tgggcagagt caatcaggta catatacctt tgcaaatggt 840 
aaagttaact ttgatgttga tgcaagcggt aatatcacta ttggcggcga aaaggctttc 900 
ttggttggtg gagcgctgac tactaacgat cccaccggct ccactccagc aacgatgtct 960 
tccctgttta aggccgcgga tgacaaagat gccgctcaat cctcgattga ttttggcggg 1020 
aaaaaatacg aatttgctgg tggcaattct actaatggtg gcggcgttaa attcaaagac 1080 
acggtgtctt ctgacgcgct tttggctcag gttaaagcgg atagtactgc taataatgta 1140 
aaaatcacct ttaacaatgg tcctctgtca ttcactgcat cgttccaaaa tggtgtatct 1200 
ggctccgcgg catcgaatgc agcctacatt gatagcgaag gcgaactgac aactactgaa 1260 
tcctacaaca caaattattc cgtagacaaa gacacggggg ctgtaagtgt tacagggggg 1320 
agcggtacgg gtaaatacgc cgcaaacgtg ggtgctcagg cttatgtagg tgcagatggt 1380 
aaattaacca cgaatactac tagtaccggc tctgcaacca aagatccact aaatgcgctg 1440 
gatgaggcaa ttgcatccat cgacaaattc cgttcttccc tgggggctat ccagaaccgt 1500 
ctggattccg cagtcaccaa cctgaacaac accactacca acctgtctga agcgcagtcc 1560 
cgtattcagg acgccgacta tgcgaccgaa gtgtccaaca tgtcgaaagc gcagatcatc 1620 
cagcaggccg gtaactccgt gttggcaaaa gctaaccagg taccgcagca ggttctgtct 1680 
ctgctgcagg gttaa 1695 
<210> 49 
<211> 1164 
<212> DNA 

<213> Escherichia coli 
<400> 49 

aacaagaacc agtctgcgct gtcgagttct atcgagcgtc tgtcttctgg cttgcgtatt 60 

aacagcgcga aggatgacgc cgcgggtcag gcgattgcta accgttttac ttctaacatt 120 

aaaggcctga ctcaggctgc acgtaacgcc aacgacggta tttctgttgc gcagaccacc 180 

gaaggcgcgc tgtccgaaat taacaacaac ttacagcgtg tgcgtgagct gactgttcag 240 

gcgaccaccg gtactaactc tgagtctgac ctgtcttcta tccaggacga aatcaaatct 300 

cgcctggaag agattgatcg tgtttcaagt cagactcaat ttaacggcgt gaatgttttg 360 

gctaaagatg ggaaaatgaa cattcaggtt ggggcaagtg atggacagac tatcactatt 420 

gatctgaaaa agatcgattc atctacacta aacctctcca gttttgatgc tacaaacttg 480 

ggcaccagtg ttaaagatgg ggccaccatc aataagcaag tggcagtaga tgctggcgac 540 

tttaaagata aagcttcagg atcgttaggt accctaaaat tagttgagaa agacggtaag 600 

tactatgtaa atgacactaa aagtagtaag tactacgatg ccgaagtaga tactagtaag 660 

ggtgaaatta acttcaactc tacaaatgaa agtggaacta ctcctactgc agcgacggaa 720 

gtaactactg ttggccgcga tgtaaaattg gatgcttctg cacttaaagc caaccaatcg 780 

cttgtcgtgt ataaagataa aagcggcaat gatgcttata tcattcagac caaagatgta 840 

acaactaatc aatcaacttt caatgccgct aatatcagtg atgctggtgt tttatctatt 900 
ggtgcatcta caaccgcgcc aagcaattta acagctgacc cgcttaaggc tcttgatgat 960 

gcaattgcat ctgttgataa attccgctct tctctcggtg ccgttcagaa ccgtctggat 1020 
tctgccattg ccaacctgaa caacaccact accaacctgt ctgaagcgca gtcccgtatt 1080 
caggacgctg actatgcgac cgaagtgtcc aacatgtcga aagcgcagat tatccagcag 1140 
gccggtaact ccgtgctggc aaaa 1164 

<210> 50 
<211> 1818 
<212> DNA 

<213> Escherichia coli 

<400> 50 

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60 

aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120 

gcgaaggatg acgcagcggg tcaggcgatt gctaaccgtt tcacctctaa cattaaaggc 180 

ctgactcagg ctgcacgtaa cgctaacgat ggtatctctc tggcgcagac cactgaaggc 240 

gcactgtctg agattaacaa caacttacaa cgtgtgcgtg agttgactgt acaggcgacc 300 

accggtacta actctgattc tgacctggct tctattcagg acgaaatcaa atcccgtttg 360 

tctgaaattg accgcgtatc cgggcagacc cagttcaacg gcgtgaacgt attgtctaaa 420 

gatggctccc tgaaaattca ggttggcgca aatgatggtc agactatctc tatcgacctg 480 

aagaaaattg actctgatac tctgggtttg aatggtttca acgttaatgg ttctggtacc 540 

attgcaaaca aagcggccac aatcagtgac ttgactgctc agaaagccgt tgacaacggt 600 

aatggtactt ataaagttac aactagcaac gctgcactta ctgcatctca ggcattaagt 660 

aagctgagtg atggcgatac tgtagatatt gcaacctatg ctggtggtac aagttcaaca 720 

gttagttata aatacgacgc agatgcaggt aacttcagtt ataacaatac tgcaaacaaa 780 
acaagtgctg cggctggaac tctggcagat actcttctcc cggcagctgg ccagactaaa 840 
accggtactt acaaggctgc tactggtgat gttaacttta atgttgacgc aactggtaat 900 
ctgacaattg gcggacagca agcctacctg actactgatg gtaaccttac aacaaacaac 960 
tccggtggtg cggctactgc aactcttaaa gagctgttta ctcttgctgg cgatggtaaa 1020 
tctctgggga acggcggtac tgctaccgtt actctggata atactacgta taatttcaaa 1080 
gctgctgcga acgttactga tggtgctggt gtcatcgctg ctgctggtgt aacttataca 1140 
gccactgttt ctaaagatgt cattctggca caactgcaat ctgcaagtca ggcagcagca 1200 
accgctaccg acggtgatac tgtcgcaacg atcaactata aatctggtgt catgatcggt 1260 
tccgctacct ttaccaatgg taaaggtact gccgatggta tgacttctgg tacaactcca 1320 
gtcgtagcta caggtgctaa agctgtatat gttgatggca acaatgaact gacttccact 1380 
gcatcttacg atacgactta ctctgtcaac gcagatacag gcgcagtaaa agtggtatca 1440 
ggtactggta ctggtaaatt tgaagctgtt gctggtgcgg atgcttatgt aagcaaagat 1500 
ggcaaattaa cgacagaaac caccagtgca ggcactgcaa ccaaagatcc tttggctgcc 1560 
ctggatgctg ctatcagctc catcgacaaa ttccgttcct ccctgggtgc tatccagaac 1620 
cgtctggatt ccgcagtcac caacctgaac aacaccacta ctaacctgtc tgaagcgcag 1680 
tcccgtattc aggacgccga ctatgcgacc gaagtgtcca atatgtcgaa agcgcagatc 1740 
atccagcagg ccggtaactc tgtgttggca aaagctaacc aggtaccgca gcaggttctg 1800 
tctctgctgc agggttaa 1818 



<210> 51 
<211> 1344 
<212> DNA 

<213> Escherichia coli 



<400> 51 

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60 
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120 
gcgaaggatg acgccgcagg tcaggcgatt gctaaccgtt ttacttctaa cattaaaggc 180 
ctgactcagg ctgcacgtaa cgccaacgac ggtatttctg ttgcacagac cactgaaggc 240 
gcgctgtccg aaatcaacaa caacttacag cgtattcgtg aactgacggt tcaggcttct 300 
accgggacta actctgattc ggatctggac tccattcagg acgaaatcaa atcccgtctc 360 
gacgaaattg accgcgtttc cggtcagacc cagttcaacg gcgtgaacgt gctggcgaaa 420 
gacggttcga tgaagattca ggttggcgcg aatgacgggc agaccatctc tatcgatttg 480 
cagaaaattg attcttcaac gctgggattg aaaggtttct cggtatcagg gaacgcatta 540 
aaagttagcg atgcgataac tacagttcct ggtgctaatg ctggcgatgc cccggttacg 600 
gttaaatttg gtgcgaacga taccgctgct gccgcaatgg ctaaaacatt gggaataagt 660 
gatacatcag gcttgtccct acataacgta caaagcgcgg atggtaaagc gacaggaacc 720 
tatgttgttc aatctggtaa tgacttctat tcggcttccg ttaatgctgg tggcgttgtt 780 
acgcttaata ccaccaatgt tactttcact gatcctgcga acggtgttac cacagcaaca 840 
cagacaggtc agcctatcaa ggtcacgacg aatagtgctg gcgcggctgt tggctatgtt 900 
actattcaag gcaaagatta ccttgctggt gcagacggta aggatgcaat tgaaaacggt 960 
ggtgacgctg caacaaatga agacacaaaa atccaactta ccgatgaact cgatgttgat 1020 
ggttctgtaa aaacagcggc aacagcaaca ttttctggta ctgcaaccaa cgatccgctg 1080 
gcacttttag acaaagctat ctcgcaagtt gatactttcc gctcctccct cggtgccgta 1140 
caaaaccgtc tggattctgc ggtcaccaac ctgaataaca ccaccaccaa cctgtctgaa 1200 
gcgcagtccc gtattcagga cgccgactat gcgaccgaag tgtccaacat gtcgaaagcg 1260 
cagatcatcc agcaggcggg taactctgtg ctgtctaaag ctaaccaggt accgcagcag 1320 
gttctgtctc tgctgcaggg ttaa 1344 
<210> 52 

<211> 2599 
<212> DNA 

<213> Escherichia coli 
<400> 52 

cttctcttag ctctgctatt gagcgtctgt cttctggtct gcgtattaac agcgcaaaag 60 
acgatgcagc aggtcaggcg attgctaacc gttttacggc aaatattaaa ggtctgaccc 120 
aggcttcccg taacgcgaat gatggtattt ctgttgcgca gaccactgaa ggtgcgctga 180 
atgaaattaa caacaacctg cagcgtattc gtgaactttc tgttcaggca actaacggta 240 
ctaactctga cagcgatctt tcttctatcc aggctgaaat tactcaacgt ctggaagaaa 300 
ttgaccgtgt atctgagcaa actcagttta acggcgtgaa agtccttgct gaaaataatg 360 
aaatgaaaat tcaggttggt gctaatgatg gtgaaaccat tgacctgccc ccacgattag 420 
atacaacact cagttagtaa cgtcggaatc ttcattctca gaatgaccct ttctccagcc 480 
cgctgcaaat tcagacggtg tctgataatt cagcgtggag tgcgggcggc attcgttata 540 
atcctgccgc cagtcattaa taattttcct ggcatgaacg atatcgctga accagtgctc 600 
attcaaacat tcatcgcgaa atcgtccgtt aaagctctca ataaatccgt tctgcgttgg 660 
cttgcccggc tggattaagc gcaactcaac accatgctca aaggcccatt gatccagtgc 720 
acggcaagtg aactccggcc cctggtcagt tcttatcgtc gccggatagc ctcgaaacag 780 
tgcaatgctg tccagaatac gcgtgacctg aacgcctgaa atcccaaagg caacagtgac 840 
cgtcaggcat tcctttgtga aatcatcgac gcaggtaaga cacttgatcc tgcgaccggt 900 
ggaaagtgcg tccatgacga aatccatcga ccaggtcaga ttgggcgccg ccggacggag 960 
cagcggcaga cgttctgttg ccagcccttt acgacgtctt ctgcgtttta cgcccaggcc 1020 
actgaggtga taaagccggt acacgcgctt atgattaaca tgaagccctt cacggcgcag 1080 
caactgccaa atacgacggt agccaaaacg cctgcgctcc agtgccagct cagtgatgcg 1140 
ccctgataaa tgcgcatcag cagccggacg gtgagcctca tagcggcagg tcgacaggga 1200 
taaacctgta agcctgcagg cacgacgttg cgacagaccg gtcgcatcac acatcaacat 1260 
cacggcttcc cgcttctggt ctgtcgtcag tactttcgcc caagagccac ctgaagcgcc 1320 
tctttatcca gcatggcttc ggcaagcagc ttcttgagtc tggtgttctc ttcctcaagc 1380 
gacttcaggc gcttaacttc aggcacctcc ataccgccat acttcttacg ccaggtgtaa 1440 
aacgtggcat cggaaatggc atgcttgcgg cagagttcac gggcgggtac cccagcttcg 1500 
gcttcgcgga gaatactgat gatctgttcg tcggaaaaac gcttcttcat ggggatgtcc 1560 
tcatgtggct tatgaagaca ttactaacat cggggtgtac taatcaacgg ggagcaggtc 1620 
accatcacta tcaatctggc aaaaattgat gcgaaaactc tcggcctgga cggttttaat 1680 
atcgatggcg cgcagaaagc aaccggcagt gacctgattt ctaaatttaa agcgacaggt 1740 
actgataatt atcaaattaa cggtactgat aactatactg ttaatgtaga tagtggagta 1800 
gtacaggata aagatggcaa acaagtttat gtgagtgctg cggatggttc acttacgacc 1860 
agcagtgata ctcaattcaa gattgatgca actaagcttg cagtggctgc taaagattta 1920 
gctcaaggta ataagattgt ctacgaaggt atcgaattta caaataccgg cactggcgct 1980 
atacctgcca caggtaatgg taaattaacc gccaatgttg atggtaaggc tgttgaattc 2040 
actatttcgg ggagtgctga tacatcaggt actagtgcaa ccgttgcccc tacgacagcc 2100 
ctatacaaaa atagtgcagg gcaattgact gcaacaaaag ttgaaaataa agcagcgaca 2160 
ctatctgatc ttgatctgaa cgctgccaag aaaacaggaa gcacgttagt tgttaacggt 2220 
gcaacttacg atgttagtgc agatggtaaa acgataacgg agactgcttc tggtaacaat 2280 
aaagtcatgt atctgagcaa atcagaaggt ggtagcccga ttctggtaaa cgaagatgca 2340 
gcaaaatcgt tgcaatctac caccaacccg ctcgaaacta tcgacaaagc attggctaaa 2400 
gttgacaatc tgcgttctga cctcggtgca gtacaaaacc gtttcgactc tgccatcacc 2460 
aaccttggca acaccgtaaa caacctgtct tctgcccgta gccgtatcga agatgctgac 2520 
tacgcgaccg aagtgtctaa catgtctcgt gcgcagatcc tgcaacaagc gggtacctct 2580 
gttctggcac aggctaacc 2599 

<210> 53 
<211> 1245 
<212> DNA 

<213> Escherichia coli 
<400> 53 

aacaaaaacc agtctgcgct gtcgacttct atcgagcgcc tctcttctgg tctgcgcatt 60 
aacagcgcta aagatgacgc tgcgggccag gcgattgcta accgcttcac ttctaacatc 120 
aaaggtctga ctcaggccgc acgtaacgcc aacgacggta tctctctggc gcagaccact 180 
gaaggcgcac tgtctgaaat caacaacaac ttgcagcgtg ttcgtgaact gaccgttcag 240 
gccactaccg gtactaactc tgattctgac ctgtcttcaa tccaggacga aatcaaatcc 300 
cgtctcgatg aaattgaccg cgtatccggt cagactcagt tcaacggcgt gaacgtactg 360 
gcaaaagatg gctcgatgaa aattcaggtc ggtgcaaatg atggtcagac aatcagcatt 420 
gatttgcaga agattgattc ttctacttta gggttaaatg gtttttctgt ttccaaaaat 480 
gcagtatctg ttggtgatgc tattactcaa ttgcctggcg agacggcagc cgatgcacca 540 
gtaaccatca agtttgatga ttcagtaaaa actgatttaa aactgaccga tgcttcaggg 600 
ttaagtctgc ataacctcaa agatgaaaat ggtaatttaa ctaaccagta tgttgtacag 660 
aatggcggaa aatcttacgc tgctacagtc gctgccaatg gtaatgttac gctgaacaaa 720 
gcaaatgtaa cctacagcga tgtcgcaaac ggtattgata ccgcaacgca gtcaggccag 780 
ttagttcagg ttggtgcaga ttctaccggt acgccaaaag cattcgtgtc tgtccaaggt 840 
aaaagctttg gcattgatga cgccgccttg aagaataaca ctggtgatgc taccgctact 900 
ccaccgggaa catctgggac aacagttgtc gcagcgtcaa ttcatctgag tacgggcaaa 960 
aactctgtag acgctgatgt aacggcttcc actgaattca caggtgcttc aaccaacgat 1020 
ccactgactc tgctggacaa agctatcgca tctgttgata aattccgttc ttctttgggg 1080 
gcggtacaga accgtctgag ctccgctgta accaacctga acaacaccac caccaacctg 1140 
tctgaagcgc agtcccgtat tcaggacgcc gactatgcga ccgaagtgtc caacatgtcg 1200 
aaagcgcaga ttatccagca ggcaggtaac tccgtgctgt ccaaa 1245 

<210> 54 
<211> 1212 
<212> DNA 

<213> Escherichia coli 
<400> 54 

aacaaaaacc agtctgcgct gtcgacttct atcgaacgcc tctcttctgg cctgcgtatt 60 

aacagtgcga aagatgacgc tgccggtcag gcgatagcta accgtttcac ctctaacatt 120 

aaaggcctga ctcaggctgc gcgtaacgcc aacgacggta tttctctggc gcagaccaca 180 

gaaggtgcgt tgtctgaaat caacaacaac ttgcaacgtg tgcgtgagtt gaccgttcag 240 

gcgacgaccg gtactaactc tgattctgac ctgtcatcta Gtcaggacga aatcaaatcc 300 

cgtctggatg agattgaccg tgtttccggt cagacccagt tcaacggcgt gaatgtactg 360 

gcaaaagacg gttcgatgaa gattcaggtt ggcgcgaatg atggccagac tattagcatt 420 

gatttacaga aaattgactc ttctacatta gggttgaatg gtttctccgt ttctgctcaa 480 

tcacttaacg ttggtgattc aattactcaa attacaggag ccgctgggac aaaacctgtt 540 

ggtgttgatt tcactgctgt tgcgaaagat ctgactactg cgacaggtaa aactgtcgat 600 

gtttccagcc tgacgttaca caacaccctg gatgcgaaag gggctgccac cgcacagttc 660 
gtcgttcaat ccggtagtga tttctactcc gcgtccattg accatgcaag tggtgaagtg 720 

acgttgaata aagccgatgt cgaatacaaa gacaccgata atggactaac gactgcagct 780 

actcagaaag atcagctgat taaagttgcc gctgactctg acggcgcggc tgcgggatat 840 

gtaacattcc agggtaaaaa ctacgctaca acggctccag cggcgcttaa tgatgacact 900 

acggcaacag ccacagcgaa caaagttgtt gttgaattat ctacagcaac tccgactgcg 960 

cagttctcag gggcttcttc tgctgatcca ctggcacttt tagacaaagc cattgcacag 1020 

gttgatactt tccgctcctc cctcggtgcc gttcaaaacc gtctggactc tgcggtaacc 1080 

aacctgaaca acaccaccac caacctgtct gaagcgcagt cccgtattca ggacgccgac 1140 

tatgcgaccg aagtgtctaa catgtcgaaa gcgcagatca tccagcaggc gggtaactct 1200 

gtgctgtcta aa 1212 



<210> 55 
<211> 1758 
<212> DNA 

<213> Escherichia coli 



<400> 55 

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60 
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120 
gcgaaggatg acgccgcggg tcaggcgatt gctaaccgtt ttacttctaa cattaaaggc 180 
ctgactcagg ctgcacgtaa cgccaacgac ggtatttctg ttgcacagac cactgaaggc 240 
gcgctgtccg aaatcaacaa caacttacag cgtatccgtg agctgacggt tcaggcttct 300 
accgggacta actctgattc ggatctggac tccattcagg acgaaatcaa atcccgtctc 360 
gacgaaattg accgcgtatc cggtcagacc cagttcaacg gcgtgaacgt actggcaaaa 420 
gacggttcga tgaaaattca ggttggtgcg aatgacggtg aaactatcac tatcgacctg 480 
aagaaaatcg attctgatac tctgggtctg aatggtttta acgtaaatgg taaaggtact 540 
attaccaaca aagctgcaac ggtaagtgat ttaacttctg ctggcgcgaa gttaaacacc 600 
acgacaggtc tttatgatct gaaaaccgaa aataccttgt taactaccga tgctgcattc 660 
gataaattag ggaatggcga taaagtcacc gttggcggcg tagattatac ttacaacgct 720 
aaatctggtg attttactac caccaaatct actgctggta cgggtgtaga cgccgcggcg 780 
caggctactg attcagctaa aaaacgtgat gcgttagctg ccacccttca tgctgatgtg 840 
ggtaaatctg ttaatggttc ttacaccaca aaagatggta ctgtttcttt cgaaacggat 900 
tcagcaggta atatcaccat cggtggaagc caggcatacg tagacgatgc aggcaacttg 960 
acgactaaca acgctggtag cgcagctaaa gctgatatga aagcgctgct taaagccgcg 1020 
agcgaaggta gtgacggtgc ctctctgaca ttcaatggca ctgaatatac tatcgcaaaa 1080 
gcaactcctg cgacaacctc tccagtagct ccgttaatcc ctggtgggat tacttatcag 1140 
gctacagtga gtaaagatgt agtattgagc gaaaccaaag cggctgccgc gacatcttca 1200 
attaccttta attccggtgt actgagcaaa actattgggt ttaccgcggg tgaatccagt 1260 
gatgctgcga agtcttatgt ggatgataaa ggtggtatta ctaacgttgc cgactataca 1320 
gtctcttaca gcgttaacaa ggataacggc tctgtgactg ttgccgggta tgcttcagcg 1380 
actgatacca ataaagatta tgctccagca attggtactg ctgtaaatgt gaactccgcg 1440 
ggtaaaatca ctactgagac taccagtgct ggttctgcaa cgaccaaccc gcttgctgcc 1500 
ctggacgacg ctatcagctc catcgacaaa ttccgttctt ccctgggtgc tatccagaac 1560 
cgtctggatt ccgcagtcac caacctgaac aacaccacta ccaacctgtc tgaagcgcag 1620 
tcccgtattc aggacgccga ctatgcgacc gaagtgtcca acatgtcgaa agcgcagatt 1680 
atccagcagg ccggtaactc cgtgctggca aaagccaacc aggtaccgca gcaggttctg 1740 
tctctgctgc agggttaa 1758 








<210> 56 
<211> 14024 
<212> DNA 

<213> Escherichia coli 
<400> 56 

gtaaccaagg gcggtacgtg cataaatttt aatgcttatc aaaactatta gcattaaaaa 60 
tatataagaa attctcaaat gaacaaagaa accgtttcaa taattatgcc cgtttacaat 120 
ggggccaaaa ctataatctc atcagtagaa tcaattatac atcaatctta tcaagatttt 180 
gttttgtata tcattgacga ttgtagcacc gatgatacat tttcattaat caacagtcga 240 
tacaaaaaca atcagaaaat aagaatattg cgtaacaaga caaatttagg tgttgcagaa 300 
agtcgaaatt atggaataga aatggccacg gggaaatata tttctttttg tgatgcggat 360 
gatttgtggc acgagaaaaa attagagcgt caaatcgaag tgttaaataa tgaatgtgta 420 
gatgtggtat gttctaatta ttatgttata gataacaata gaaatattgt tggcgaagtt 480 
aatgctcctc atgtgataaa ttatagaaaa atgctcatga aaaactacat agggaatttg 540 
acaggaatct ataatgccaa caaattgggt aagttttatc aaaaaaagat tggtcacgag 600 
gattatttga tgtggctgga aataattaat aaaacaaatg gtgctatttg tattcaagat 660 
aatctggcgt attacatgcg ttcaaataat tcactatcgg gtaataaaat taaagctgca 720 
aaatggacat ggagtatata tagagaacat ttacatttgt cctttccaaa aacattatat 780 
tattttttat tatatgcttc aaatggagtc atgaaaaaaa taacacattc actattaagg 840 
agaaaggaga ctaaaaagtg aagtcagcgg ctaagttgat ttttttattc ctatttacac 900 
tttatagtct ccagttgtat ggggttatca tagatgatcg tataacaaat tttgatacaa 960 
aggtattaac tagtattata attatatttc agattttttt tgttttatta ttttatctaa 1020 
cgattataaa tgaaagaaaa cagcagaaaa aatttatcgt gaactgggag ctaaagttaa 1080 
tactcgtttt cctttttgtg actatagaaa ttgctgctgt agttttattt cttaaagaag 1140 
gtattcctat atttgatgat gatccagggg gggctaaact tagaatagct gaaggtaatg 1200 
gactttacat tagatatatt aagtattttg gtaatatagt tgtgtttgca ttaattattc 1260 
tttatgatga gcataaattc aaacagagga ccatcatatt tgtatatttt acaacgattg 1320 
ctttatttgg ttatcgttct gaattggtgt tgctcattct tcaatatata ttgattacca 1380 
atatcctgtc aaaggataac cgtaatccta aaataaaaag aataataggg tattttttat 1440 
tggtaggggt tgtatgctcg ttgttttatc taagtttagg acaagacgga gaacaaaatg 1500 
actcatataa taatatgtta aggataatta ataggttaac aatagagcaa gttgaaggtg 1560 
ttccatatgt tgtttctgaa tctattaaga acgatttctt tccgacacca gagttagaaa 1620 
aggaattaaa agcaataata aatagaatac agggaataaa gcatcaagac ttattttatg 1680 
gagaacggtt acataaacaa gtatttggag acatgggagc aaatttttta tcagttacta 1740 
cgtatggagc agaactgtta gttttttttg gttttctctg tgtattcatt atccctttag 1800 
ggatatatat acctttttat cttttaaaga gaatgaaaaa aacccatagc tcgataaatt 1860 
gcgcattcta ttcatatatc attatgattt tattgcaata cttagtggct gggaatgcat 1920 
cggccttctt ttttggtcct tttctctccg tattgataat gtgtactcct ctgatcttat 1980 
tgcatgatac gttaaagaga ttatcacgaa atgaaaatat cagttataac tgtgacttat 2040 
aataatgctg aagggttaga aaaaacttta agtagtttat caattttaaa aataaaacct 2100 
tttgagatta ttatagttga tggcggctct acagatggaa cgaatcgtgt cattagtaga 2160 
tttactagta tgaatattac acatgtttat gaaaaagatg aagggatata tgatgcgatg 2220 
aataagggcc gaatgttggc caaaggcgac ttaatacatt atttaaacgc cggcgatagc 2280 
gtaattggag atatatataa aaatatcaaa gagccatgtt tgattaaagt tggccttttc 2340 
gaaaatgata aacttctggg attttcttct ataacccatt caaatacagg gtattgtcat 2400 
caaggggtga ttttcccaaa gaatcattca gaatatgatc taaggtataa aatatgtgct 2460 
gattataagc ttattcaaga ggtgtttcct gaagggttaa gatctctatc tttgattact 2520 
tcgggttatg taaaatatga tatgggggga gtatcttcaa aaaaaagaat tttaagagat 2580 
aaagagcttg ccaaaattat gtttgaaaaa aataaaaaaa accttattaa gtttattcca 2640 
atttcaataa tcaaaatttt attccctgaa cgtttaagaa gagtattgcg gaaaatgcaa 2700 
tatatttgtc taactttatt cttcatgaag aatagttcac catatgataa tgaataaaat 2760 
caaaaaaata cttaaatttt gcactttaaa aaaatatgat acatcaagtg ctttaggtag 2820 
agaacaggaa aggtacagga ttatatcctt gtctgttatt tcaagtttga ttagtaaaat 2880 
actctcacta ctttctctta tattaactgt aagtttaact ttaccttatt taggacaaga 2940 
gagatttggt gtatggatga ctattaccag tcttggtgct gctctgacat ttttggactt 3000 
aggtatagga aatgcattaa caaacaggat cgcacattca tttgcgtgtg gcaaaaattt 3060 
aaagatgagt cggcaaatta gtggtgggct cactttgctg gctggattat cgtttgtcat 3120 
aactgcaata tgctatatta cttctggcat gattgattgg caactagtaa taaaaggtat 3180 
aaacgagaat gtgtatgcag agttacaaca ctcaattaaa gtctttgtaa tcatatttgg 3240 
acttggaatt tattcaaatg gtgtgcaaaa agtttatatg ggaatacaaa aagcctatat 3300 
aagtaatatt gttaatgcca tatttatatt gttatctatt attactctag taatatcgtc 3360 
gaaactacat gcgggactac cagttttaat tgtcagcact cttggtattc aatacatatc 3420 
gggaatctat ttaacaatta atcttattat aaagcgatta ataaagttta caaaagttaa 3480 
catacatgct aaaagagaag ctccatattt gatattaaac ggttttttct tttttatttt 3540 
acagttaggc actctggcaa catggagtgg tgataacttt ataatatcta taacattggg 3600 
tgttacttat gttgctgttt ttagcattac acagagatta tttcaaatat ctacggtccc 3660 
tcttacgatt tataacatcc cgttatgggc tgcttatgca gatgctcatg cacgcaatga 3720 
tactcaattt ataaaaaaga cgctcagaac atcattgaaa atagtgggta tttcatcatt 3780 
cttattggcc ttcatattag tagtgttcgg tagtgaagtc gttaatattt ggacagaagg 3840 
aaagattcag gtacctcgaa cattcataat agcttatgct ttatggtctg ttattgatgc 3900 
tttttcgaat acatttgcaa gctttttaaa tggtttgaac atagttaaac aacaaatgct 3960 
tgctgttgta acattgatat tgatcgcaat tccagcaaaa tacatcatag ttagccattt 4020 
tgggttaact gttatgttgt actgcttcat ttttatatat attgtaaatt actttatatg 4080 
gtataaatgt agttttaaaa aacatatcga tagacagtta aatataagag gatgaaaatg 4140 
aaatatatac cagtttacca accgtcattg acaggaaaag aaaaagaata tgtaaatgaa 4200 
tgtctggact caacgtggat ttcatcaaaa ggaaactata ttcagaagtt tgaaaataaa 4260 
tttgcggaac aaaaccatgt gcaatatgca actactgtaa gtaatggaac ggttgctctt 4320 
catttagctt tgttagcgtt aggtatatcg gaaggagatg aagttattgt tccaacactg 4380 
acatatatag catcagttaa tgctataaaa tacacaggag ccacccccat tttcgttgat 4440 
tcagataatg aaacttggca aatgtctgtt agtgacatag aacaaaaaat cactaataaa 4500 
actaaagcta ttatgtgtgt ccatttatac ggacatccat gtgatatgga acaaattgta 4560 
gaactggcca aaagtagaaa tttgtttgta attgaagatt gcgctgaagc ctttggttct 4620 
aaatataaag gtaaatatgt gggaacattt ggagatattt ctacttttag cttttttgga 4680 
aataaaacta ttactacagg tgaaggtgga atggttgtca cgaatgacaa aacactttat 4740 
gaccgttgtt tacattttaa aggccaagga ttagctgtac ataggcaata ttggcatgac 4800 
gttataggct acaattatag gatgacaaat atctgcgctg ctataggatt agcccagtta 4860 
gaacaagctg atgattttat atcacgaaaa cgtgaaattg ctgatattta taaaaaaaat 4920 
atcaacagtc ttgtacaagt ccacaaggaa agtaaagatg tttttcacac ttattggatg 4980 
gtctcaattc taactaggac cgcagaggaa agagaggaat taaggaatca ccttgcagat 5040 
aaactcatcg aaacaaggcc agttttttac cctgtccaca cgatgccaat gtactcggaa 5100 
aaatatcaaa agcaccctat agctgaggat cttggttggc gtggaattaa tttacctagt 5160 
ttccccagcc tatcgaatga gcaagttatt tatatttgtg aatctattaa cgaattttat 5220 
agtgataaat agcctaaaat attgtaaagg tcattcatga aaattgcgtt gaattcagat 5280 
ggattttacg agtggggcgg tggaattgat tttattaaat atattctgtc aatattagaa 5340 
acgaaaccag aaatatgtat cgatattctt ttaccgagaa atgatataca ttctcttata 5400 
agagaaaaag catttccttt taaaagtata ttaaaagcaa ttttaaagag ggaaaggcct 5460 
cgatggattt cattaaatag atttaatgag caatactata gagatgcctt tacacaaaat 5520 
aatatagaga cgaatcttac ctttattaaa agtaagagct ctgcctttta ttcatatttt 5580 
gatagtagcg attgtgatgt tattcttcct tgcatgcgtg ttccttcggg aaatttgaat 5640 
aaaaaagcat ggattggtta tatttatgac tttcaacact gttactatcc ttcatttttt 5700 
agtaagcgag aaatagatca aaggaatgtg ttttttaaat tgatgctcaa ttgcgctaac 5760 
aatattattg ttaatgcaca ttcagttatt accgatgcaa ataaatatgt tgggaattat 5820 
tctgcaaaac tacattctct tccatttagt ccatgccctc aattaaaatg gttcgctgat 5880 
tactctggta atattgccaa atataatatt gacaaggatt attttataat ttgcaatcaa 5940 
ttttggaaac ataaagatca tgcaactgct tttagggcat ttaaaattta tactgaatat 6000 
aatcctgatg tttatttagt atgcacggga gctactcaag attatcgatt ccctggatat 6060 
tttaatgaat tgatggtttt ggcaaaaaag ctcggaattg aatcgaaaat taagatatta 6120 
gggcatatac ctaaacttga acaaattgaa ttaatcaaaa attgcattgc tgtaatacaa 6180 
ccaaccttat ttgaaggcgg gcctggaggg ggggtaacat ttgacgctat tgcattaggg 6240 
aaaaaagtta tactatctga catagatgtc aataaagaag ttaattgcgg tgatgtatat 6300 
ttctttcagg caaaaaacca ttattcatta aatgacgcga tggtaaaagc tgatgaatct 6360 
aaaatttttt atgaacctac aactctgata gaattgggtc tcaaaagacg caatgcgtgt 6420 
gcagattttc ttttagatgt tgtgaaacaa gaaattgaat cccgatctta atatattcaa 6480 
gaggtatata atgactaaag tcgctcttat tacaggtgta actggacaag atggatctta 6540 
tctagctgag tttttgcttg ataaagggta tgaagttcat ggtatcaaac gccgagcctc 6600 
atcttttaat acagaacgca tagaccatat ttatcaagat ccacatggtt ctaacccaaa 6660 
ttttcacttg cactatggag atctgactga ttcatctaac ctcactagaa ttctaaagga 6720 
ggtacagcca gatgaagtat ataatttagc tgctatgagt cacgtagcag tttcttttga 6780 
gtctccagaa tatacagccg atgtcgatgc aattggtaca ttacgtttac tggaagcaat 6840 
tcgcttttta ggattggaaa acaaaacgcg tttctatcaa gcttcaacct cagaattata 6900 
tggacttgtt caggaaatcc ctcaaaaaga atccacccct ttttatcctc gttcccctta 6960 
tgcagttgca aaactttacg catattggat cacggtaaat tatcgagagt catatggtat 7020 
ttatgcatgt aatggtatat tgttcaatca tgaatctcca cgccgtggag aaacgtttgt 7080 
aacaaggaaa attactcgag gacttgcaaa tattgcacaa ggcttggaat catgtttgta 7140 
tttagggaat atggattcgt tacgagattg gggacatgca aaagattatg ttagaatgca 7200 
atggttgatg ttacaacagg agcaacccga agattttgtg attgcaacag gagtccaata 7260 
ctcagtccgt cagtttgtcg aaatggcagc agcacaactt ggtattaaga tgagctttgt 7320 
tggtaaagga atcgaagaaa aaggcattgt agattcggtt gaaggacagg atgctccagg 7380 
tgtgaaacca ggtgatgtca ttgttgctgt tgatcctcgt tatttccgac cagctgaagt 7440 
tgatactttg cttggagatc cgagcaaagc taatctcaaa cttggttgga gaccagaaat 7500 
tactcttgct gaaatgattt ctgaaatggt tgccaaagat cttgaagccg ctaaaaaaca 7560 
ttctctttta aaatcgcatg gtttttctgt aagcttagct ctggaatgat gatgaataag 7620 
caacgtattt ttattgctgg tcaccaagga atggttggat cagctattac ccgacgcctc 7680 
aaacaacgtg atgatgttga gttggtttta cgtactcggg atgaattgaa cttgttggat 7740 
agtagcgctg ttttggattt tttttcttca cagaaaatcg accaggttta tttggcagca 7800 
gcaaaagtcg gaggtatttt agctaacagt tcttatcctg ccgattttat atatgagaat 7860 
ataatgatag aggcgaatgt cattcatgct gccdacaaaa ataatgtaaa taaactgctt 7920 
ttcctcggtt cgtcgtgtat ttatcctaag ttagcacacc aaccgattat ggaagacgaa 7980 
ttattacaag ggaaacttga gccaacaaat gaaccttatg ctatcgcaaa aattgcaggt 8040 
attaaattat gtgaatctta taaccgtcag tttgggcgtg attaccgttc agtaatgcca 8100 
accaatcttt atggtccaaa tgacaatttt catccaagta attctcatgt gattccggcg 8160 
cttttgcgcc gctttcatga tgctgtggaa aacaattctc cgaatgttgt tgtttgggga 8220 
agtggtactc caaagcgtga attcttacat gtagatgata tggcttctgc aagcatttat 8280 
gtcatggaga tgccatacga tatatggcaa aaaaatacta aagtaatgtt gtctcatatc 8340 
aatattggaa caggtattga ctgcacgatt tgtgagcttg cggaaacaat agcaaaagtt 8400 
gtaggttata aagggcatat tacgttcgat acaacaaagc ccgatggagc ccctcgaaaa 8460 
ctacttgatg taacgcttct tcatcaacta ggttggaatc ataaaattac ccttcacaag 8520 
ggtcttgaaa atacatacaa ctggtttctt gaaaaccaac ttcaatatcg ggggtaataa 8580 
tgtttttaca ttcccaagac tttgccacaa ttgtaaggtc tactcctctt atttctatag 8640 
atttgattgt ggaaaacgag tttggcgaaa ttttgctagg aaaacgaatc aaccgcccgg 8700 
cacagggcta ttggttcgtt cctggtggta gggtgttgaa agatgaaaaa ttgcagacag 8760 
cctttgaacg attgacagaa attgaactag gaattcgttt gcctctctct gtgggtaagt 8820 
tttatggtat ctggcagcac ttctacgaag acaatagtat ggggggagac ttttcaacgc 8880 
attatatagt tatagcattc cttcttaaat tacaaccaaa cattttgaaa ttaccgaagt 8940 
cacaacataa tgcttattgc tggctatcgc gagcaaagct gataaatgat gacgatgtgc 9000 
attataattg tcgcgcatat tttaacaata aaacaaatga tgcgattggc ttagataata 9060 
aggatataat atgtctgatg cgccaataat tgctgtagtt atggccggtg gtacaggcag 9120 
tcgtctttgg ccactttctc gtgaactata tccaaagcag tttttacaac tctctggtga 9180 
taacaccttg ttacaaacga ctttgctacg actttcaggc ctatcatgtc aaaaaccatt 9240 
agtgataaca aatgaacagc atcgctttgt tgtggctgaa cagttaaggg aaataaataa 9300 
attaaatggt aatattattc tagaaccatg cgggcgaaat actgcaccag caatagcgat 9360 
atctgcgttt catgcgttaa aacgtaatcc tcaggaagat ccattgcttc tagttcttgc 9420 
ggcagaccac gttatagcta aagaaagtgt tttctgtgat gctattaaaa atgcaactcc 9480 
catcgctaat caaggtaaaa ttgtaacgtt tggaattata ccagaatatg ctgaaactgg 9540 
ttatgggtat attgagagag gtgaactatc tgtaccgctt caagggcatg aaaatactgg 9600 
tttttattat gtaaataagt ttgtcgaaaa gcctaatcgt gaaaccgcag aattgtatat 9660 
gacttctggt aatcactatt ggaatagtgg aatattcatg tttaaggcat ctgtttatct 9720 
tgaggaattg agaaaattta gacctgacat ttacaatgtt tgtgaacagg ttgcctcatc 9780 
ctcatacatt gatctagatt ttattcgatt atcaaaagaa caatttcaag attgtcctgc 9840 
tgaatctatt gattttgctg taatggaaaa aacagaaaaa tgtgttgtat gccctgttga 9900 
tattggttgg agtgacgttg gatcttggca atcgttatgg gacattagtc taaaatcgaa 9960 
aacaggagat gtatgtaaag gtgatatatt aacctatgat actaagaata attatatcta 10020 
ctctgagtca gcgttggtag ccgccattgg aattgaagat atggttatcg tgcaaactaa 10080 
agatgccgtt cttgtgtcta aaaagagtga tgtacagcat gtaaaaaaaa tagtcgaaat 10140 
gcttaaattg cagcaacgta cagagtatat tagtcatcgt gaagttttcc gaccatgggg 10200 
aaaatttgat tcgattgacc aaggtgagcg atacaaagtc aagaaaatta ttgtgaaacc 10260 
tggtgagggg ctttctttaa ggatgcatca ccatcgttct gaacattgga tcgtgctttc 10320 
tggtacagca aaagtaaccc ttggcgataa aactaaacta gtcaccgcaa atgaatcgat 10380 
atacattccc cttggcgcag cgtatagtct tgagaatccg ggcataatcc ctcttaatct 10440 
tattgaagtc agttcagggg attatttggg agaggatgat attataagac agaaagaacg 10500 
ttacaaacat gaagattaac atatgaaatc tttaacctgc tttaaagcct atgatattcg 10560 
cgggaaatta ggcgaagaac tgaatgaaga tattgcctgg cgcattgggc gtgcctatgg 10620 
cgaatttctc aaaccgaaaa ccattgtttt aggcggtgat gtccgcctca ccagcgaagc 10680 
gttaaaactg gcgcttgcga aaggtttaca ggatgcgggc gtcgatgtgc tggatatcgg 10740 
tatgtdcggc accgaagaga tctatttcgc cacgttccat ctcggagtgg atggcggcat 10800 
cgaagttacc gccagccata acccgatgga ttacaacggc atgaagctgg tgcgcgaagg 10860 
ggctcgcccg atcagcggtg ataccggact gcgcgatgtc cagcgtctgg cagaagccaa 10920 
tgacttccct cctgtcgatg aaaccaaacg tggtcgctat cagcaaatca atctgcgtga 10980 
cgcttacgtt gatcacctgt tcggttatat caacgtcaaa aacctcacgc cgctcaagct 11040 
ggtgatcaac tccgggaacg gcgcagcggg tccggtggtg gacgccattg aagcccgatt 11100 
taaagccctc ggcgcaccgg tggaattaat caaagtacac aacacgccgg acggcaattt 11160 
ccccaacggt attcctaacc cgctgctgcc ggaatgccgc gacgacaccc gtaatgcggt 11220
catcaaacac ggcgcggata tgggcattgc ctttgatggc gattttgacc gctgtttcct 11280 
gtttgacgaa aaagggcagt ttatcgaggg ctactacatt gtcggcctgc tggcagaagc 11340 
gttcctcgaa aaaaatcccg gcgcgaagat catccacgat ccacgtctct cctggaacac 11400 
cgttgatgtg gtgactgccg caggcggcac cccggtaatg tcgaaaaccg gacacgcctt 11460 
tattaaagaa cgtatgcgca aggaagacgc catctacggt ggcgaaatga gcgctcacca 11520 
ttacttccgt gatttcgctt actgcgacag cggcatgatc ccgtggctgc tggtcgccga 11580 
actggtgtgc ctgaaaggaa aaacgctggg cgaaatggtg cgcgaccgga tggcggcgtt 11640 
tccggcaagc ggtgagatca acagcaaact ggcgcaaccc gttgaggcaa ttaatcgcgt 11700 
ggaacagcat tttagccgcg aggcgctggc ggtggatcgc accgatggca tcagcatgac 11760 
ctttgccgac tggcgcttta acctgcgctc ctccaacacc gaaccggtgg tgcggttgaa 11820 
tgtggaatca cgcggtgatg taaagctaat ggaaaagaaa actaaagctc ttcttaaatt 11880 
gctaagtgag tgattattta cattaatcat taagcgtatt taagattata ttaaagtaat 11940 
gttattgcgg tatatgatga atatgtgggc ttttttatgt ataacgacta taccgcaact 12000 
ttatctagga aaagattaat agaaataaag ttttgtactg accaatttgc atttcacgtc 12060
acgattgaga cgttcctttg cttaagacat tttttcatcg cttatgtaat aacaaatgtg 12120 
ccttatataa aaaggagaac aaaatggaac ttaaaataat tgagacaata gatttttatt 12180 
atccctgttt acgatattat agccaaagtt gtatcctgca tcagtcctgc aatatttcac 12240 
gagtgctttg ttaactgaat acatgtctgc cattttccag atgataacga cgtcatcgca 12300 
attgatggta aaacacttcg gcacacttat gacaagagtc gtcgcagagg agtggttcat 12360 
gtcattagtg cgtttcagca atgcacagtc tggtcctcgg atagatcaag acggatgaga 12420 
aacctaatgc gttcacagtt attcatgaac tttctaaaat gatgggtatt aaaggaaaaa 12480 
taatcataac tgatgcgatg gcttgccaga aagatattgc agagaagata taaaaacaga 12540 
gatgtgatta tttattcgct gtaaaaggaa ataagagtcg gcttaataga gtctttgagg 12600 
agatatttac gctgaaagaa ttaaataatc caaaacatga cagttacgca attagtgaaa 12660 
agaggcacgg cagagacgat gtccgtcttc atattgtttg agatgctcct gatgagctta 12720 
ttgatttcac gtttgaatgg aaagggctgc agaatttatg aatggcagtc cactttctct 12780 
caataatagc agagcaaaag aaagaatccg aaatgacgat caaatattat attagatctg 12840
ctgctttaac cgcagagaag ttcgccacag taaatcgaaa tcactggcgc atggagaata 12900 
agttgcacag tagcctgatg tggtaatgaa tgaaatcgac tataatataa gaaggcgagt 12960 
tgcattcgaa tgattttcta gaatgcggca catcgctatt aatatctgac aatgataatg 13020
tattcaaggc aggattatca tgtaagatgc gaaaagcagt catggacaga aacttcctag 13080
cgtcaggcat tgcagcgtgc gggctttcat aatcttgcat tggttttgat aagatatttc 13140 
tttggagatg ggaaaatgaa tttgtatggt atttttggtg ctggaagtta tggtagagaa 13200 
acaataccca ttctaaatca acaaataaag caagaatgtg gttctgacta tgctctggtt 13260 
tttgtggatg atgttttggc aggaaagaaa gttaatggtt ttgaagtgct ttcaaccaac 13320
tgctttctaa aagcccctta tttaaaaaag tattttaatg ttgctattgc taatgataag 13380
atacgacaga gagtgtctga gtcaatatta ttacacgggg ttgaaccaat aactataaaa 13440 
catccaaata gcgttgttta tgatcatact atgataggta gtggcgctat tatttctccc 13500
tttgttacaa tatctactaa tactcatata gggaggtttt ttcatgcaaa catatactca 13560
tacgttgcac atgattgtca aataggagac tatgttacat ttgctcctgg ggctaaatgt 13620
aatggatatg ttgttattga agacaatgca tatataggct cgggtgcagt aattaagcag 13680
ggtgttccta atcgcccact tattattggc gcgggagcca ttataggtat gggggctgtt 13740 
gtcactaaaa gtgttcctgc cggtataact gtgtgcggaa atccagcaag agaaatgaaa 13800
agatcgccaa catctattta atgggaatgc gaaaacacgt tccaaatggg actaatgttt 13860
aaaatatata taatttcgct aatttactaa attatggctt ctttttaagc tatcctttac 13920 
ttagttatta ctgatacagc atgaaattta taatactctg atacattttt atacgttatt 13980 
caagccgcat atctagcggt aacccctgac aggagtaaac aatg 14024 
<210> 57
<211> 1758 
<212> DNA 

<213> Escherichia coli 



<400> 57 

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120
gcgaaggatg acgccgcagg tcaggcgatt gctaaccgtt ttacttctaa cattaaaggc 180
ctgactcagg cggcccgtaa cgccaacgac ggtatttctg ttgcgcagac caccgaaggc 240
gcgctgtccg aaatcaacaa caacttacag cgtattcgtg aactgacggt tcaggccact 300
acagggacta actccgattc tgacctggac tccatccagg acgaaatcaa atctcgtctt 360
gatgaaattg accgcgtatc cggccagacc cagttcaacg gcgtgaacgt gctggcgaaa 420
gacggttcaa tgaaaattca ggttggtgcg aatgacggcg aaaccatcac gatcgacctg 480
aaaaaaatcg attctgatac tctgggtctg aatggcttta acgtaaatgg taaaggtact 540
attaccaaca aagctgcaac ggtaagtgat ttaacttctg ctggcgcgaa gttaaacacc 600
acgacaggtc tttatgatct gaaaaccgaa aataccttgt taactaccga tgctgcattc 660
gataaattag ggaatggcga taaagtcaca gttggcggcg tagattatac ttacaacgct 720
aaatctggtg attttactac cactaaatct actgctggta cgggtgtaga cgccgcggcg 780
caggctgctg attcagcttc aaaacgtgat gcgttagctg ccacccttca tgctgatgtg 840
ggtaaatctg ttaatggttc ttacaccaca aaagatggta ctgtttcttt cgaaacggat 900
tcagcaggta atatcaccat cggtggaagc caggcatacg tagacgatgc aggcaacttg 960
acgactaaca acgctggtag cgcagctaaa gctgatatga aagcgctgct caaagcagcg 1020
agcgaaggta gtgacggtgc ctctctgaca ttcaatggca cagaatatac catcgcaaaa 1080
gcaactcctg cgacaaccac tccagtagct ccgttaatcc ctggtgggat tacttatcag 1140
gctacagtga gtaaagatgt agtattgagc gaaaccaaag cggctgccgc gacatcttca 1200
attaccttta attccggtgt actgagcaaa actattgggt ttaccgcggg tgaatccagt 1260
gatgctgcga agtcttatgt ggatgataaa ggtggtatca ctaacgttgc cgactataca 1320
gtctcttaca gcgttaacaa ggataacggc tctgtgactg ttgccgggta tgcttcagcg 1380
actgatacca ataaagatta tgctccagca attggtactg ctgtaaatgt gaactccgcg 1440
ggtaaaatca ctactgagac taccagtgct ggttctgcaa cgaccaaccc gcttgctgcc 1500
ctggacgacg caatcagctc catcgacaaa ttccgttctt ccctgggtgc tatccagaac 1560
cgtctggatt ccgcagtcac caacctgaac aacaccacta ccaacctgtc cgaagcgcag 1620
tcccgtattc aggacgccga ctatgcgacc gaagtgtcca acatgtcgaa agcgcagatc 1680
attcagcagg ccggtaactc cgtgctggca aaagctaacc aggtaccgca gcaggttctg 1740
tctctgctgc agggttaa 1758



<210> 58 
<211> 1758 
<212> DNA 

<213> Escherichia coli 



<400> 58 

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120
gcgaaggatg acgcagcggg tcaggcgatt gctaaccgtt ttacttctaa cattaaaggc 180
ctgactcagg ctgcacgtaa cgccaacgac ggtatttctg ttgcgcagac caccgaaggc 240
gcgctgtccg aaatcaacaa caacttacag cgtattcgtg aactgacggt tcaggccact 300

acagggacta actccgattc tgacctggac tccatccagg acgaaatcaa atctcgtctt 360

gatgaaattg accgcgtatc cggccagacc cagttcaacg gcgtgaacgt gctggcgaaa 420

gacggttcaa tgaaaattca ggttggtgcg aatgacggcg aaaccatcac gatcgacctg 480 

aaaaaaatcg attctgatac tctgggtctg aatggcttta acgtaaatgg taaaggtact 540 

attaccaaca aagctgcaac ggtaagtgat ttaacttctg ctggcgcgaa gttaaacacc 600 

acgacaggtc tttatgatct gaaaaccgaa aataccttgt taactaccga tgctgcattc 660 

gataaattag ggaatggcga taaagtcaca gttggcggcg tagattatac ttacaacgct 720 

aaatctggtg attttactac cactaaatct actgctggta cgggtgtaaa cgccgcggcg 780 

caggctgctg attcagcttc aaaacgtgat gcgttagctg ccacccttca tgctgatgtg 840 

ggtaaatctg ttaatggttc ttacaccaca aaagatggta ctgtttcttt cgaaacggat 900 

tcagcaggta atatcaccat cggtggaagc caggcatacg tagacgatgc aggcaacttg 960 

acgactaaca acgctggtag cgcagctaaa gctgatatga aagcgctgct caaagcagcg 1020 

agcgaaggta gtgacggtgc ctctctgaca ttcaatggca cagaatatac catcgcaaaa 1080 

gcaactcctg cgacaaccac tccagtagct ccgttaatcc ctggtgggat tacttatcag 1140 

gctacagtga gtaaagatgt agtattgagc gaaaccaaag cggctgccgc gacatcttca 1200

attaccttta attccggtgt actgagcaaa actattgggt ttaccgcggg tgaatccagt 1260

gatgctgcga agtcttatgt ggatgataaa ggtggtatca ctaacgttgc cgactataca 1320 

gtctcttaca gcgttaacaa ggataacggc tctgtgactg ttgccgggta tgcttcagcg 1380

actgatacca ataaagatta tgctccagca attggcactg ctgtaaatgt gaactccgcg 1440 

ggtaaaatca ctactgagac taccagtgct ggttctgcaa cgaccaaccc gcttgctgcc 1500 

ctggacgacg caatcagctc catcgacaaa ttccgttctt ccctgggtgc tatccagaac 1560 

cgtctggatt ccgcggtcac caacctgaac aacaccacta ccaacctgtc cgaagcgcag 1620 

tcccgtattc aggacgccga ctatgcgacc gaagtgtcca acatgtcgaa agcgcagatc 1680 

atccagcagg ccggtaactc cgtgctggca aaagctaacc aggtaccgca gcaggttctg 1740 
tctctgctgc agggttaa 1758 

<210> 59 
<211> 1758 
<212> DNA 

<213> Escherichia coli 
<400> 59 

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60 

aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120 

gcgaaggatg acgccgcggg tcaggcgatt gctaaccgtt ttacttctaa cattaaaggc 180 

ctgactcagg ctgcacgtaa cgccaacgac ggtatttctg ttgcacagac cactgaaggc 240 

gcgctgtccg aaatcaacaa caacttacag cgtatccgtg agctgacggt tcaggcttct 300 

accgggacta actctgattc ggatctggac tccattcagg acgaaatcaa atcccgtctc 360

gacgaaattg accgcgtatc cggtcagacc cagttcaacg gcgtgaacgt actggcaaaa 420 

gacggttcga tgaaaattca ggttggtgcg aatgacggtg aaactatcac tatcgacctg 480 

aagaaaatcg attctgatac tctgggtctg aatggtttta acgtaaatgg taaaggtact 540 

attaccaaca aagctgcaac ggtaagtgat ttaacttctg ctggcgcgaa gttaaacacc 600 

acgacaggtc tttatgatct gaaaaccgaa aataccttgt taactaccga tgctgcattc 660 

gataaattag ggaatggcga taaagtcacc gttggcggcg tagattatac ttacaacgct 720 

aaatctggtg attttactac caccaaatct actgctggta cgggtgtaga cgccgcggcg 780 

caggctactg attcagctaa aaaacgtgat gcgttagctg ccacccttca tgctgatgtg 840 

ggtaaatctg ttaatggttc ttacaccaca aaagatggta ctgtttcttt cgaaacggat 900 
tcagcaggta atatcaccat cggtggaagc caggcatacg tagacgatgc aggcaacttg 960
acgactaaca acgctggtag cgcagctaaa gctgatatga aagcgctgct taaagccgcg 1020 
agcgaaggta gtgacggtgc ctctctgaca ttcaatggca ctgaatatac tatcgcaaaa 1080 
gcaactcctg cgacaacctc tccagtagct ccgttaatcc ctggtgggat tacttatcag 1140 
gctacagtga gtaaagatgt agtattgagc gaaaccaaag cggctgccgc gacatcttca 1200 
attaccttta attccggtgt actgagcaaa actattgggt ttaccgcggg tgaatccagt 1260 
gatgctgcga agtcttatgt ggatgataaa ggtggtatta ctaacgttgc cgactataca 1320 
gtctcttaca gcgttaacaa ggataacggc tctgtgactg ttgccgggta tgcttcagcg 1380
actgatacca ataaagatta tgctccagca attggtactg ctgtaaatgt gaactccgcg 1440 
ggtaaaatca ctactgagac taccagtgct ggttctgcaa cgaccaaccc gcttgctgcc 1500 
ctggacgacg ctatcagctc catcgacaaa ttccgttctt ccctgggtgc tatccagaac 1560 
cgtctggatt ccgcagtcac caacctgaac aacaccacta ccaacctgtc tgaagcgcag 1620
tcccgtattc aggacgccga ctatgcgacc gaagtgtcca acatgtcgaa agcgcagatt 1680 
atccagcagg ccggtaactc cgtgctggca aaagccaacc aggtaccgca gcaggttctg 1740 
tctctgctgc agggttaa 1758 

<210> 60 
<211> 1758 
<212> DNA 

<213> Escherichia coli 
<400> 60 

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60 

aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120 

gcgaaggatg acgccgcagg tcaggcgatt gctaaccgtt ttacttctaa cattaaaggc 180 

ctgactcagg cggcccgtaa cgccaacgac ggtatttctg ttgcgcagac caccgaaggc 240 

gcgctgtccg aaatcaacaa caacttacag cgtattcgtg aactgacggt tcaggccact 300 

acagggacta actccgattc tgacctggac tccatccagg acgaaatcaa atctcgtctt 360

gatgaaattg accgcgtatc cggccagacc cagttcaacg gcgtgaacgt gctggcgaaa 420 

gacggttcaa tgaaaattca ggttggtgcg aatgacggcg aaaccatcac gatcgacctg 480 

aaaaaaatcg attctgatac tctgggtctg aatggcttta acgtaaatgg taaaggtact 540 

attaccaaca aagctgcaac ggtaagtgat ttaacttctg ctggcgcgaa gttaaacacc 600 

acgacaggtc tttatgatct gaaaaccgaa aataccttgt taactaccga tgctgcattc 660 

gataaattag ggaatggcga taaagtcaca gttggcggcg tagattatac ttacaacgct 720 

aaatctggtg attttactac cactaaatct actgctggta cgggtgtaga cgccgcggcg 780 

caggctgctg attcagcttc aaaacgtgat gcgttagctg ccacccttca tgctgatgtg 840 

ggtaaatctg ttaatggttc ttacaccaca aaagatggta ctgtttcttt cgaaacggat 900 

tcagcaggta atatcaccat cggtggaagc caggcatacg tagacgatgc aggcaacttg 960 

acgactaaca acgctggtag cgcagctaaa gctgatatga aagcgctgct caaagcagcg 1020 

agcgaaggta gtgacggtgc ctctctgaca ttcaatggca cagaatatac catcgcaaaa 1080 

gcaactcctg cgacaaccac tccagtagct ccgttaatcc ctggtgggat tacttatcag 1140 

gctacagtga gtaaagatgt agtattgagc gaaaccaaag cggctgccgc gacatcttca 1200 

attaccttta attccggtgt actgagcaaa actattgggt ttaccgcggg tgaatccagt 1260 

gatgctgcga agtcttatgt ggatgataaa ggtggtatca ctaacgttgc cgactataca 1320 

gtctcttaca gcgttaacaa ggataacggc tctgtgactg ttgccgggta tgcttcagcg 1380 

actgatacca ataaagatta tgctccagca attggtactg ctgtaaatgt gaactccgcg 1440 

ggtaaaatca ctactgagac taccagtgct ggttctgcaa cgaccaaccc gcttgctgcc 1500 

ctggacgacg caatcagctc catcgacaaa ttccgttctt ccctgggtgc tatccagaac 1560 
cgtctggatt ccgcagtcac caacctgaac aacaccacta ccaacctgtc cgaagcgcag 1620

tcccgtattc aggacgccga ctatgcgacc gaagtgtcca acatgtcgaa agcgcagatc 1680 

attcagcagg ccggtaactc cgtgctggca aaagctaacc aggtaccgca gcaggttctg 1740 
tctctgctgc agggttaa 1758 

<210> 61 
<211> 1758 
<212> DNA 

<213> Escherichia coli 



<400> 61 

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120
gcgaaggatg acgccgcagg tcaggcgatt gctaaccgtt ttacttctaa cattaaaggc 180
ctgactcagg ctgcacgtaa cgccaacgac ggtatttctg ttgcgcagac caccgaaggc 240
gcgctgtccg aaatcaacaa caacttacag cgtattcgtg aactgacggt tcaggccact 300
acagggacta actccgattc tgacctggac tccatccagg acgaaatcaa atctcgtctt 360
gatgaaattg accgcgtatc cggccagacc cagttcaacg gcgtgaacgt gctggcgaaa 420
gacggttcaa tgaaaattca ggttggtgcg aatgacggcg aaaccatcac gatcgacctg 480
aaaaaaatcg attctgatac tctgggtctg aatggcttta acgtaaatgg taaaggtact 540
attaccaaca aagctgcaac ggtaagtgat ttaacttctg ctggcgcgaa gttaaacacc 600
acgacaggtc tttatgatct gaaaaccgaa aataccttgt taactaccga tgctgcattc 660
gataaattag ggaatggcga taaagtcaca gttggcggcg tagattatac ttacaacgct 720
aaatctggtg attttactac cactaaatct actgctggta cgggtgtaga cgccgcggcg 780
caggctgctg attcagcttc aaaacgtgat gcgttagctg ccacccttca tgctgatgtg 840
ggtaaatctg ttaatggttc ttacaccaca aaagatggta ctgtttcttt cgaaacggat 900
tcagcaggta atatcaccat cggtggaagc caggcatacg tagacgatgc aggcaacttg 960
acgactaaca acgctggtag cgcagctaaa gctgatatga aagcgctgct caaagcagcg 1020
agcgaaggta gtgacggtgc ctctctgaca ttcaatggca cagaatatac catcgcaaaa 1080
gcaactcctg cgacaaccac tccagtagct ccgttaatcc ctggtgggat tacttatcag 1140
gctacagtga gtaaagatgt agtattgagc gaaaccaaag cggctgccgc gacatcttca 1200
attaccttta attccggtgt actgagcaaa actattgggt ttaccgcggg tgaatccagt 1260
gatgctgcga agtcttatgt ggatgataaa ggtggtatca ctaacgttgc cgactataca 1320
gtctcttaca gcgttaacaa ggataacggc tctgtgactg ttgccgggta tgcttcagcg 1380
actgatacca ataaagatta tgctccagca attggcactg ctgtaaatgt gaactccgcg 1440
ggtaaaatca ctactgagac taccagtgct ggttctgcaa cgaccaaccc gcttgctgcc 1500
ctggacgacg caatcagctc catcgacaaa ttccgttctt ccctgggtgc tatccagaac 1560
cgtctggatt ccgcggtcac caacctgaac aacaccacta ccaacctgtc cgaagcgcag 1620
tcccgtattc aggacgccga ctatgcgacc gaagtgtcca acatgtcgaa agcgcagatc 1680
atccagcagg ccggtaactc cgtgctggca aaagctaacc aggtaccgca gcaggttctg 1740
tctctgctgc agggttaa 1758

<210> 62
<211> 1758
<212> DNA

<213> Escherichia coli
<400> 62
atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60

aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120 

gcgaaggatg acgccgcggg tcaggcgatt gctaaccgtt ttacttctaa cattaaaggc 180 

ctgactcagg ctgcacgtaa cgccaacgac ggtatttctg ttgcacagac cactgaaggc 240 

gcgctgtccg aaatcaacaa caacttacag cgtatccgtg agctgacggt tcaggcttct 300 

accgggacta actctgattc ggatctggac tccattcagg acgaaatcaa atcccgtctc 360 

gacgaaattg accgcgtatc cggtcagacc cagttcaacg gcgtgaacgt actggcaaaa 420 

gacggttcga tgaaaattca ggttggtgcg aatgacggtg aaactatcac tatcgacctg 480 

aagaaaatcg attctgatac tctgggtctg aatggtttta acgtaaatgg taaaggtact 540 

attaccaaca aagctgcaac ggtaagtgat ttaacttctg ctggcgcgaa gttaaacacc 600 

acgacaggtc tttatgatct gaaaaccgaa aataccttgt taactaccga tgctgcattc 660 

gataaattag ggaatggcga taaagtcacc gttggcggcg tagattatac ttacaacgct 720 

aaatctggtg attttactac caccaaatct actgctggta cgggtgtaga cgccgcggcg 780 

caggctactg attcagctaa aaaacgtgat gcgttagctg ccacccttca tgctgatgtg 840 

ggtaaatctg ttaatggttc ttacaccaca aaagatggta ctgtttcttt cgaaacggat 900 

tcagcaggta atatcaccat cggtggaagc caggcatacg tagacgatgc aggcaacttg 960 

acgactaaca acgctggtag cgcagctaaa gctgatatga aagcgctgct taaagccgcg 1020 

agcgaaggta gtgacggtgc ctctctgaca ttcaatggca ctgaatatac tatcgcaaaa 1080 

gcaactcctg cgacaacctc tccagtagct ccgttaatcc ctggtgggat ttcttatcag 1140 

gctacagtga gtaaagatgt agtattgagc gaaaccaaag cggctgccgc gacatcttca 1200 

attaccttta attccggtgt actgagcaaa actattgggt ttaccgcggg tgaatccagt 1260 

gatgctgcga agtcttatgt ggatgataaa ggtggtatta ctaacgttgc cgactataca 1320

gtctcttaca gcgttaacaa ggataacggc tctgtgactg ttgccgggta tgcttcagcg 1380

actgatacca ataaagatta tgctccagca attggtactg ctgtaaatgt gaactccgcg 1440 

ggtaaaatca ctactgagac taccagtgct ggttctgcaa cgaccaaccc gcttgctgcc 1500 

ctggacgacg ctatcagctc catcgacaaa ttccgttctt ccctgggtgc tatccagaac 1560 

cgtctggatt ccgcagtcac caacctgaac aacaccacta ccaacctgtc tgaagcgcag 1620 

tcccgtattc aggacgccga ctatgcgacc gaagtgtcca acatgtcgaa agcgcagatt 1680 

atccagcagg ccggtaactc cgtgctggca aaagccaacc aggtaccgca gcaggttctg 1740 
tctctgctgc agggttaa 1758 

<210> 63 
<211> 1758 
<212> DNA 

<213> Escherichia coli 
<400> 63 

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60 

aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120 

gcgaaggatg acgccgcagg tcaggcgatt gctaaccgtt ttacttctaa cattaaaggc 180 

ctgactcagg cggcccgtaa cgccaacgac ggtatttctg ttgcgcagac caccgaaggc 240 

gcgctgtccg aaatcaacaa caacttacag cgtattcgtg aactgacggt tcaggccact 300

acagggacta actccgattc tgacctggac tccatccagg acgaaatcaa atctcgtctt 360

gatgaaattg accgcgtatc cggccagacc cagttcaacg gcgtgaacgt gctggcgaaa 420 

gacggttcaa tgaaaattca ggttggtgcg aatgacggcg aaaccatcac gatcgacctg 480 

aaaaaaatcg attctgatac tctgggtctg aatggcttta acgtaaatgg taaaggtact 540 

attaccaaca aagctgcaac ggtaagtgat ttaacttctg ctggcgcgaa gttaaacacc 600 

acgacaggtc tttatgatct gaaaaccgaa aataccttgt taactaccga tgctgcattc 660 
gataaattag ggaatggcga taaagtcaca gttggcggcg tagattatac ttacaacgct 720
aaatctggtg attttactac cactaaatct actgctggta cgggtgtaga cgccgcggcg 780
caggctgctg attcagcttc aaaacgtgat gcgttagctg ccacccttca tgctgatgtg 840 
ggtaaatctg ttaatggttc ttacaccaca aaagatggta ctgtttcttt cgaaacggat 900 
tcagcaggta atatcaccat cggtggaagc caggcatacg tagacgatgc aggcaacttg 960 
acgactaaca acgctggtag cgcagctaaa gctgatatga aagcgctgct caaagcagcg 1020
agcgaaggta gtgacggtgc ctctctgaca ttcaatggca cagaatatac catcgcaaaa 1080 
gcaactcctg cgacaaccac tccagtagct ccgttaatcc ctggtgggat tacttatcag 1140 
gctacagtga gtaaagatgt agtattgagc gaaaccaaag cggctgccgc gacatcttca 1200 
attaccttta attccggtgt actgagcaaa actattgggt ttaccgcggg tgaatccagt 1260
gatgctgcga agtcttatgt ggatgataaa ggtggtatca ctaacgttgc cgactataca 1320 
gtctcttaca gcgttaacaa ggataacggc tctgtgactg ttgccgggta tgcttcagcg 1380
actgatacca ataaagatta tgctccagca attggtactg ctgtaaatgt gaactccgcg 1440 
ggtaaaatca ctactgagac taccagtgct ggttctgcaa cgaccaaccc gcttgctgcc 1500 
ctggacgacg caatcagctc catcgacaaa ttccgttctt ccctgggtgc tatccagaac 1560 
cgtctggatt ccgcagtcac caacctgaac aacaccacta ccaacctgtc cgaagcgcag 1620 
tcccgtattc aggacgccga ctatgcgacc gaagtgtcca acatgtcgaa agcgcagatc 1680 
attcagcagg ccggtaactc cgtgctggca aaagctaacc aggtaccgca gcaggttctg 1740 
tctctgctgc agggttaa 1758 

<210> 64 
<211> 1758 
<212> DNA 

<213> Escherichia coli 
<400> 64 

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60 
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120
gcgaaggatg acgccgcggg tcaggcgatt gctaaccgtt ttacttctaa cattaaaggc 180 
ctgactcagg ctgcacgtaa cgccaacgac ggtatttctg ttgcacagac caccgaaggc 240 
gcgctgtctg aaatcaacaa caacttacag cgtatccgtg agctgacggt tcaggcttct 300
accggaacta actctgattc ggatctggac tccattcagg acgaaatcaa atcccgtctt 360
gatgaaattg accgcgtatc cggccagacc cagttcaacg gcgtgaacgt actggcaaaa 420 
gacggttcga tgaaaattca ggttggtgcg aatgacggtg aaactatcac tatcgacctg 480 
aagaaaatcg attctgatac tctgggtctg aatggtttta acgtaaatgg taaaggtact 540 
attaccaaca aagctgcaac ggtaagtgat ttaacttctg ctggcgcgaa gttaaacacc 600 
acgacaggtc tttatgatct gaaaaccgaa aataccttgt taactaccga tgctgcattc 660 
gataaattag ggaatggcga taaagtcacc gttggcggcg tagattatac ttacaacgct 720 
aaatctggtg attttactac caccaaatct actgctggta cgggtgtaga cgccgcggcg 780 
caggctactg attcagctaa aaaacgtgat gcgttagctg ccacccttca tgctgatgtg 840 
ggtaaatctg ttaatggttc ttacaccaca aaagatggta ctgtttcttt cgaaacggat 900 
tcagcaggta atatcaccat cggtggaagc caggcatacg tagacgatgc aggcaacttg 960
acgactaaca acgctggtag cgcagctaaa gctgatatga aagcgctgct taaagccgcg 1020 
agcgaaggta gtgacggtgc ttctctgaca ttcaatggca ctgaatatac tatcgcaaaa 1080 
gcaactcctg cgacaacctc tccagtagct ccgttaatcc ctggtgggat tacttatcag 1140 
gctacagtga gtaaagatgt agtattgagc gaaaccaaag cggctgccgc gacatcttca 1200 
attaccttta attccggtgt actgagcaaa actattgggt ttaccgcggg tgaatccagt 1260 
gatgctgcga agtcttatgt ggatgataaa ggtggtatta ctaacgttgc cgactataca 1320 




gtctcttaca gcgttaacaa ggataacggc tctgtgactg ttgccgggta tgcttcagcg 1380
actgatacca ataaagatta tgctccagca attggtactg ctgtaaatgt gaactccgcg 1440 
ggtaaaatca ctactgagac taccagtgct ggttctgcaa cgaccaaccc gcttgctgcc 1500 
ctggacgacg ctatcagctc catcgacaaa ttccgttctt ccctgggtgc tatccagaac 1560 
cgtctggatt ccgcagtcac caacctgaac aacaccacta ccaacctgtc tgaagcgcag 1620 
tcccgtattc aggacgccga ctatgcgacc gaagtgtcca acatgtcgaa agcgcagatt 1680 
atccagcagg ccggtaactc cgtgctggca aaagccaacc aggtaccgca gcaggttctg 1740 
tctctgctgc agggttaa 1758 

<210> 65 
<211> 1758 
<212> DNA 

<213> Escherichia coli 
<400> 65 

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60 
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120 
gcgaaggatg acgccgcggg tcaggcgatt gctaaccgtt ttacttctaa cattaaaggc 180 
ctgactcagg ctgcacgtaa cgccaacgac ggtatttctg ttgcacagac cactgaaggc 240 
gcgctgtccg aaatcaacaa caacttacag cgtatccgtg agctgacggt tcaggcttct 300 
accgggacta actctgattc ggatctggac tccattcagg acgaaatcaa atcccgtctc 360 
gacgaaattg accgcgtatc cggtcagacc cagttcaacg gcgtgaacgt actggcaaaa 420 
gacggttcga tgaaaattca ggttggtgcg aatgacggtg aaactatcac tatcgacctg 480 
aagaaaatcg attctgatac tctgggtctg aatggtttta acgtaaatgg taaaggtact 540 
attaccaaca aagctgcaac ggtaagtgat ttaacttctg ctggcgcgaa gttaaacacc 600 
acgacaggtc tttatgatct gaaaaccgaa aataccttgt taactaccga tgctgcattc 660 
gataaattag ggaatggcga taaagtcacc gttggcggcg tagattatac ttacaacgct 720 
aaatctggtg attttactac caccaaatct actgctggta cgggtgtaga cgccgcggcg 780 
caggctactg attcagctaa aaaacgtgat gcgttagctg ccacccttca tgctgatgtg 840 
ggtaaatctg ttaatggttc ttacaccaca aaagatggta ctgtttcttt cgaaacggat 900 
tcagcaggta atatcaccat cggtggaagc caggcatacg tagacgatgc aggcaacttg 960 
acgactaaca acgctggtag cgcagctaaa gctgatatga aagcgctgct taaagccgcg 1020
agcgaaggta gtgacggtgc ctctctgaca ttcaatggca ctgaatatac tatcgcaaaa 1080 
gcaactcctg cgacaacctc tccagtagct ccgttaatcc ctggtgggat ttcttatcag 1140 
gctacagtga gtaaagatgt agtattgagc gaaaccaaag cggctgccgc gacatcttca 1200 
attaccttta attccggtgt actgagcaaa actattgggt ttaccgcggg tgaatccagt 1260 
gatgctgcga agtcttatgt ggatgataaa ggtggtatta ctaacgttgc cgactataca 1320 
gtctcttaca gcgttaacaa ggataacggc tctgtgactg ttgccgggta tgcttcagcg 1380
actgatacca ataaagatta tgctccagca attggtactg ctgtaaatgt gaactccgcg 1440 
ggtaaaatca ctactgagac taccagtgct ggttctgcaa cgaccaaccc gcttgctgcc 1500 
ctggacgacg ctatcagctc catcgacaaa ttccgttctt ccctgggtgc tatccagaac 1560 
cgtctggatt ccgcagtcac caacctgaac aacaccacta ccaacctgtc tgaagcgcag 1620 
tcccgtattc aggacgccga ctatgcgacc gaagtgtcca acatgtcgaa agcgcagatt 1680 
atccagcagg ccggtaactc cgtgctggca aaagccaacc aggtaccgca gcaggttctg 1740 
tctctgctgc agggttaa 1758 

<210> 66 
<211> 1788 
<212> DNA

<213> Escherichia coli 



<400> 66 

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60 

aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120

gcgaaggatg acgccgcggg tcaggcgatt gctaaccgtt ttacttctaa cattaaaggc 180 

ctgactcagg ctgcacgtaa cgccaacgac ggtatttctg ttgcacagac cactgaaggc 240 

gcgctgtccg aaatcaacaa caacttacag cgtatccgtg agctgacggt tcaggcttct 300

accgggacta actctgattc ggatctggac tccattcagg acgaaatcaa atcccgtctc 360

gacgaaattg accgcgtatc cggtcagacc cagttcaacg gcgtgaacgt actggcaaaa 420

gacggttcga tgaaaattca ggtaggtgcg aacgacggcc agactatcac tattgatctg 480 

aagaaaattg actctgatac gctggggctg aatggtttta acgtgaatgg ttccggtacg 540 

atagccaata aagcggcgac cattagcgac ctgacagcag cgaaaatgga tgctgcaact 600 

aatactataa ctacaacaaa taatgcgctg actgcatcaa aggcccttga tcaactgaaa 660 

gatggtgaca ctgttactat caaagcagat gcagctcaaa ctgccacggt ctatacatac 720

aatgcatctg ctggtaactt ctcattcagt aatgtatcga ataatacttc agcaaaagca 780 

ggtgatgtag cagctagcct tctcccgccg gctgggcaaa ctgctagtgg tgtttacaaa 840 

gcagcaagcg gtgaagtgaa ctttgatgtt gatgcgaatg gtaaaattac aatcggagga 900 

caggaagcct atttaactag tgatggtaac ttaactacaa acgatgctgg tggtgcgact 960 

gcggctacgc ttgatggttt attcaagaaa gctggtgatg gtcaatcaat cgggtttaat 1020

aagactgcat cagtcacgat ggggggaaca acttataact ttaaaacggg tgctgatgct 1080

ggtgctgcaa ctgctaacgc aggggtatcg ttcactgata cagctagcaa agaaaccgtt 1140 

ttaaataaag tggctacagc taaacaaggc acagcagttg cagctaacgg tgatacatcc 1200 

gcaacaatta cctataaatc tggcgttcag acgtatcagg cggtatttgc cgcaggtgac 1260 

ggtactgcta gcgcaaaata tgccgataat actgacgttt ctaatgcaac agcaacatac 1320

acagatgctg atggtgaaat gactacaatt ggttcataca ccacgaagta ttcaatcgat 1380

gctaacaacg gcaaggtaac tgttgattct ggaactggtt cgggtaaata tgcgccgaaa 1440 

gtcggggctg aagtatatgt tagtgctaat ggtactttaa caacagatgc aactagcgaa 1500 

ggcacagtaa caaaagatcc actgaaagct ctggatgaag ctatcagctc catcgacaaa 1560 

ttccgttcat ccctgggggc tatccaaaac cgtttggatt ccgccgtcac caacctgaac 1620 
aacaccacta ccaacctgtc tgaagcgcag tcccgtattc aggacgccga ctatgcgacc 1680 

gaagtgtcca acatgtcgaa agcgcagatt atccagcagg ccggtaactc cgtgctggca 1740 
aaagccaacc aggtaccgca gcaggttctg tctctactgc agggttaa 1788 

<210> 67 
<211> 1398 
<212> DNA 

<213> Escherichia coli 



<400> 67 

aacaaatctc agtcttctct tagctctgct attgagcgtc tgtcttctgg tctgcgtatt 60
aacagcgcaa aagacgatgc agcaggtcag gcgattgcta accgttttac ggcaaatatt 120
aaaggtctga cccaggcttc ccgtaacgca aatgatggta tttctgttgc gcagaccact 180
gaaggtgcgc tgaatgaaat taacaacaac ctgcagcgta ttcgtgaact ttctgttcag 240
gcaactaacg gtactaactc tgacagtgac ctgacctcca tccagtccga aatccagcag 300
cgtctgagtg aaattgaccg tgtttctggt cagactcagt ttaacggcgt taaagtgctg 360
gcttctgatc aggatatgac tattcaggtt ggtgcaaacg acggcgaaac aattactatt 420
aaactgcagg aaattaattc cgacacactg ggattatctg gttttggtat taaagatcct 480

actaaattaa aagccgcaac ggctgaaaca acctattttg gatcgacagt taagcttgct 540 

gacgctaata cacttgatgc agatattaca gctacagtta aaggcactac gactccgggc 600 

caacgtgacg gtaatattat gtctgatgct aacggtaagt tgtacgttaa agttgccggt 660 

tcagataaac ccgctgaaaa tggttattat gaagttactg tggaggatga tccgacatct 720 

cctgatgcag gtaagctgaa gctgggggct ctagcgggta cccagcctca agctggtaat 780 

ttaaaggaag tcacaacggt gaaagggaag ggggctattg atgttcagtt gggtactgat 840 

accgcaaccg cttctatcac aggtgcaaaa ctctttaagt tagaagacgc caatggcaaa 900 

gatactggtt catttgcgtt gattggtgat gacggtaaac agtatgcagc gaatgttgat 960 

cagaaaacag gagcagtttc cgttaaaaca atgtcttaca ctgatgctga cggtgtcaaa 1020 

cacgacaatg ttaaagttga actgggtgga agcgatggca aaaccgaagt tgtaactgca 1080 

accgatggca aaacttacag tgttagtgat ttacaaggta agagcctgaa aactgattct 1140 

attgcagcaa tttctacgca gaaaacagaa gatcctttgg ctgctatcga taaagcactg 1200 

tctcaggttg actcgttgcg ttctaaccta ggtgcaattc aaaatcgttt cgactctgcc 1260 

atcaccaacc ttggcaacac cgtaaacaac ctgtcttctg cccgtagccg tatcgaagat 1320 

gctgactacg cgaccgaagt gtctaacatg tctcgtgcgc agatcctgca acaagcgggt 1380

acctctgttc tggcgcag 1398 

<210> 68 
<211> 1479 
<212> DNA 

<213> Escherichia coli 
<400> 68 

aacaaatctc agtcttctct gagctccgcc attgaacgtc tctcttctgg cctgcgtatt 60 
aacagtgcta aagatgacgc agcaggtcag gcgattgcta accgttttac agcaaatatt 120 
aaaggtctga ctcaggcttc ccgtaacgcg aatgatggta tttctgttgc gcagaccact 180 
gaaggtgcgc tttctgaaat caacaataac ttacagcgta ttcgtgaatt gtcagtacag 240 
gccactaatg gtacaaactc tgactccgac ctgaattcaa ttcaggatga aattacacaa 300 
cgccttagtg aaattgatcg tgtttctaac cagacacaat ttaatggtgt aaaagttctg 360
gcttctgatc agactatgaa aattcaagta ggtgcgaacg atggtgaaac cattgagatt 420 
gcccttgata aaattgatgc taaaaccttg gggcttgata actttagcgt agcaccagga 480 
aaagttccaa tgtcctctgc ggttgcactt aagagcgaag ccgctcctga cttaactaag 540 
gtaaatgcaa ctgatggtag tgtgggaggt gctaaagcat tcggtagcaa ttataaaaat 600 
gctgatgttg aaacttattt tggtaccggt aatgtacaag atacaaagga tacaactgat 660
gcgaccggta ctgcaggaac aaaagtttat caagtacagg tggaagggca gacttatttt 720 
gttggtcaag ataataatac caacacgaac ggttttacat tattgaaaca aaactctaca 780 
ggttatgaaa aagttcaggt gggtggtaag gatgttcagt tagcaaactt tggtggtcgt 840 
gtaactgcat ttgttgaaga taatggttct gccacatcag ttgatttagc tgcgggtaaa 900 
atgggtaaag cattagctta taatgatgca ccaatgtctg tttattttgg gggaaaaaac 960 
ctagatgtcc accaagtaca agatacccaa gggaatcctg tacctaattc atttgctgct 1020 
aaaacatcag acggcaccta cattgcagta aatgtagatg ccgctacagg taacacgtct 1080 
gttattactg atcctaatgg taaggcagtt gaatgggcag taaaaaatga tggttctgca 1140 
caggcaatta tgcgtgaaga tgataaggtt tatacagcca atatcacgaa taagacggca 1200 
accaaaggtg ctgaactcag tgcctcagat ttgaaagcct tagcaaccac aaatccatta 1260 
tccacattag acgaagcttt ggcaaaagtt gataagttgc gcagttcttt gggtgcagta 1320 
caaaaccgtt tcgactctgc catcaccaac cttggcaaca ccgtaaacaa cctgtcttct 1380
gcccgtagcc gtatagaaga tgctgactac gcaaccgaag tgtctaacat gtctcgtgcg 1440 
cagatcctgc aacaagcggg tacctctgtt ctggcacag 1479